Abstract

We consider the canonical shared link network formed by a source node, hosting a library of m 
information messages (files), connected via a noiseless common link to n destination nodes (users), each 
with a cache of size M files. Users request files at random and independently, according to a given 
a-priori demand distribution q. A coding scheme for this network consists of a caching placement (i.e., 
a mapping of the library files into the user caches) and delivery scheme (i.e., a mapping for the library 
files and user demands into a common multicast codeword) such that, after the codeword transmission, 
all users can retrieve their requested file. The rate of the scheme is defined as the average codeword 
length normalized with respect to the length of one file, where expectation is taken over the random 
user demands. For the same shared link network, in the case of deterministic demands, the optimal 
min-max rate has been characterized within a uniform bound, independent of the network parameters. 
In particular, fractional caching (i.e., storing file segments) and using linear network coding has been 
shown to provide a min-max rate reduction proportional to 1/M with respect to standard schemes such as 
unicasting or “naive” uncoded multicasting. The case of random demands was previously considered by 
applying the same order-optimal min-max scheme separately within groups of files requested with similar 
probability. However, no order-optimal guarantee was provided for random demands under the average 
rate performance criterion. In this paper, we consider the random demand setting and provide general 
achievability and converse results. In particular, we consider a family of schemes that combine random 
fractional caching according to a probability distribution p that depends on the demand distribution q, 
with a linear coded delivery scheme based on chromatic number index coding. For the special but relevant 
case where q is a Zipf distribution with parameter α, we provide a comprehensive characterization of the 
order-optimal rate for all regimes of the system parameters n, m, M, α. We complement our scaling law 
analysis with numerical results that confirm the superiority of our schemes with respect to previously 
proposed schemes for the same setting. 


Index Terms 

Random Caching, Coded Multicasting, Network Coding, Index Coding, Content Distribution, Scaling Laws.
I. Introduction

Content distribution services such as video on demand (VoD), catch-up TV, and internet video streaming are premier drivers of the exponential traffic growth experienced in today’s wireless networks [1]. A key feature of this type of service is the time-shifted nature of user requests for the same content, or asynchronous content reuse [2]: while there exists a relatively small number of popular files that account for most of the traffic, users access them at arbitrary times, such that naive multicasting,^1 as implemented in Media Broadcasting Single Frequency Networks (MBSFN) [3], is not useful. In fact, because of the large asynchronism of the user demands, present technology (e.g., DASH, Dynamic Adaptive Streaming over HTTP [4]) employs conventional unicasting, i.e., each user request is treated as an independent information message, thus missing the opportunity of exploiting the redundancy of the user demands.

Due to the increasing cost and scarcity of wireless bandwidth, an emerging and promising approach for improving over both naive multicasting and conventional unicasting consists of using storage resources to cache popular content directly at the wireless edge, e.g., at small-cell base stations or end user devices.^2

Caching has been widely studied in several wireline contexts, primarily for web proxy caching systems and content distribution networks (CDNs) [5]-[?]. In these works, a range of interrelated problems, such as accurate prediction of demand, intelligent content placement, and efficient online replacement, is considered. The data placement problem was introduced in [?], where the objective is to find the placement of data objects in an arbitrary network with capacity-constrained caches, such that the total access cost is minimized. It was shown that this problem is a generalization of the metric uncapacitated facility location problem and hence is NP-hard [?]. Tractable approaches in terms of LP relaxation [?], [?] or greedy algorithms [?], [?] have been proposed, by exploiting special assumptions such as network symmetry and hierarchical structures. On the other hand, an extensive line of work has addressed the content replacement problem, where the objective is to adaptively refresh the cache(s) content while a certain user data request process evolves in time [10]-[12]. The most common cache replacement/eviction algorithms are least frequently used (LFU) and least recently used (LRU), by which the least frequently/recently used content object is evicted upon arrival of a new object to a network cache. A combination of placement and replacement algorithms is also possible and in fact used in today’s CDNs, which operate by optimizing the placement of content objects over long time periods for which content popularity can be estimated, and using local replacement algorithms to handle short time-scale demand variations.

^1 Naive multicasting refers to the transmission of a common, not-network-coded stream of data packets, simultaneously received and decoded by multiple users.

^2 Note that storage capacity has become exceedingly cheap: for example, a 2 TByte hard disk, enough to store 1000 movies, costs less than $100.

In a more recent set of works (a non-exhaustive list of which includes [?], [13]-[?]), an information theoretic view of caching has provided insights into the fundamental limiting performance of caching networks of practical relevance. In this framework, the underlying assumption is that there exists a fixed library of m possible information messages (files) and a given network topology that includes nodes that host a subset of messages (sources), request a subset of messages (users), and/or have constrained cache capacity (helpers/caches). The caching phase is part of the code set-up, and consists of filling up the caches with (coding) functions of the messages whose entropy is constrained to be not larger than the corresponding cache capacity. After this set-up phase, the network is “used” for an arbitrarily long time, referred to as the delivery phase. At each request round, a subset of the nodes (users) request subsets of the files in the library and the network must coordinate transmissions such that these requests are satisfied, i.e., at the end of each round all destinations must decode the requested set of files. The performance metric here is the number of time slots necessary to satisfy all the demands. In the case of symmetric links, the number of time slots can be normalized by the number of time slots necessary to send a single file across a point-to-point link. Therefore, the performance metric is the rate defined as in the index coding setting [25]-[34], i.e., the number of equivalent file transmissions.


A. Related work 


Focusing on the subset of current works directly relevant to this paper, in [13] (see also the subsequently published papers [14], [15]) a bipartite network formed by helper nodes (with caches), user nodes (without caches), and capacitated noiseless links was studied in the case of random i.i.d. requests according to some known demand distribution. This is a special case of the data placement problem with trivial routing. The problem in [13]-[15] consists of minimizing the average rate necessary to satisfy all users, where averaging is over the random requests.

In [16], [17], the data placement problem is generalized to the (coded) content distribution problem (CDP), where information can be not only stored and routed, but also coded, over the network. The authors showed an equivalence between the CDP and the network coding problem over a so-called caching-demand augmented graph, which proved the polynomial solvability of the CDP under uniform demands (each user requests the same subset of files), and the hardness of the CDP under arbitrary demands. The authors further showed, via simulations on simple networks, the potential of network coding to enable cache cooperation gains between caches sharing a multicast link from a content source. While this work suggested the benefit of cooperative caching via both the direct exchange of information between neighbor nodes and coded multicast transmissions from a common source, the analytical characterization of the optimal caching performance in arbitrary networks remains a hard and open problem. To this end, significant progress has been made by considering specific network models that capture scenarios of practical relevance, especially in wireless networks.

In [?], the authors considered a Device-to-Device (D2D) network with caching nodes that are at the same time helpers and users, communicating with each other under the interference avoidance “protocol model” of [?]. In this setting, under i.i.d. random requests following a Zipf distribution [12], [37] (with Zipf parameter α < 1), it was shown that decentralized random caching and uncoded^3 unicast delivery achieves order-optimal average per-user throughput,^4 shown to scale as $\Theta(M/m)$^5 when both the number of users n and the library size m grow large and satisfy nM > m (i.e., the aggregate cache across the network can contain the whole file library).


Concurrently, another line of work in [18], [19] considered a different network topology, here referred to as the shared link network. This is formed by a single source node (a server or base station) with all m files, connected via a shared noiseless link to n user nodes, each with a cache of size M files. In [18], [19], the authors addressed the min-max rate problem, i.e., minimizing (over the coding scheme) the worst-case rate (over the user demands). Both deterministic and random caching schemes, with corresponding coded multicast delivery schemes, were shown to provide approximately optimal min-max rate, i.e., within a multiplicative constant, independent of n, m, M, from an information theoretic lower bound. Interestingly, when translating the results of [18], [19] in terms of per-user throughput, for the case nM > m this also scales as $\Theta(M/m)$.

The ensemble of these results shows the remarkable fact that, both for the D2D and for the shared link network topologies, caching in the user devices can turn memory into bandwidth: Moore’s law (scaling of silicon integration) reflects directly in terms of a per-user throughput gain, in the sense that doubling the user device storage capacity M yields seamlessly a two-fold increase in the per-user throughput. The D2D approach of [?] exploits the spatial reuse of D2D local communication, since caching allows each user to access the desired content within a short range. Instead, the approach of [18], [19] exploits the multiplexing gain of global communication, creating network-coded symbols that are simultaneously useful to a large number of users. In an effort to combine these two gains, [?] considered the same D2D wireless network of [?], with a caching and coded delivery scheme inspired by [18], [19]. Somewhat counterintuitively, it was shown that the spatial reuse and coded multicasting gains are not cumulative. An informal explanation of this fact follows by observing that D2D spatial reuse and coded multicasting have contrasting goals. On one hand, spatial reuse is maximized by keeping communication as “local” as possible, such that the same time slot can be reused with high density in space. On the other hand, coded multicasting produces codewords that are useful to many users, so that it is advantageous to have (coded) transmissions as “global” as possible.

^3 We refer to as “uncoded” the schemes that send packets of individual files, in contrast to “coded” schemes that send mixtures of packets from different files (inter-session network coding).

^4 The per-user throughput expressed in bits per time slot is inversely proportional to the rate expressed in number of equivalent file transmissions, through a system constant that is irrelevant as far as scaling laws are concerned. Hence, in this context, maximizing throughput or minimizing rate are equivalent goals.

^5 We will use the following standard order notation: given two functions f and g, we say that: 1) $f(n) = O(g(n))$ if there exists a constant c and integer N such that $f(n) \leq c\,g(n)$ for n > N; 2) $f(n) = o(g(n))$ if $\lim_{n \to \infty} f(n)/g(n) = 0$; 3) $f(n) = \Omega(g(n))$ if $g(n) = O(f(n))$; 4) $f(n) = \omega(g(n))$ if $g(n) = o(f(n))$; 5) $f(n) = \Theta(g(n))$ if $f(n) = O(g(n))$ and $g(n) = O(f(n))$.


While several variants and extensions of these basic setups have been recently considered [20], [23], [?]-[?], in this work we focus on the combination of the random requests aspect (as in [?], [13], [?], [?]) and the single source shared link network (as in [18], [19]). This problem has been treated in [20], which considered a strategy based on partitioning the file library into subsets of approximately uniform request probability, and applying to each subset the strategy for the min-max approach of [19]. This is motivated by observing that the average rate with random uniform demands is related, within a constant factor, to the min-max rate under arbitrary demands. Then, by partitioning the set of files and allocating the cache memory across such subsets, the problem is decomposed into subproblems, each of which can be separately addressed by reusing the arbitrary demand strategy. Due to the difficulty of finding the optimal file partitioning and corresponding cache memory allocation, [20] restricts its analysis to a scheme in which, for any two files in the same partition, the file popularities differ by at most a factor of two.


B. Contributions 


While in [20] this approach is studied for a general demand distribution, our scaling order-optimality results apply to the specific case of a Zipf demand distribution. This is a very relevant case in practice since the popularity of Internet content has been shown, experimentally, to follow a Zipf power law [12], [?] (or its variations [48]) defined as follows: a file f = 1, ..., m is requested with probability

$$ q_f = \frac{f^{-\alpha}}{\sum_{i=1}^{m} i^{-\alpha}}, \quad \forall f \in \{1, \dots, m\}, \tag{1} $$

where α > 0 is the Zipf parameter. In this context, our objective is to characterize the scaling laws of the optimal average rate and provide simple and explicit order-optimal schemes. Specifically, the contributions of this work are as follows:

1) We recognize that the sub-optimality of the scheme analyzed in [20] is due to the fact that files are partitioned according to their local popularity without considering the effects of the remaining system parameters (n, m, M) on the “aggregate user demand distribution”. In particular, the probability with which each user requests files can be very different from the probability with which each file is requested by the aggregate users. The other limitation of [20] is that the scheme for coded delivery (see details in Section III) is applied separately for each file group, resulting in missed coding opportunities between different groups. We propose a different way to optimize the random caching placement, according to a caching distribution that depends on all system parameters, and not just the “local” demand distribution q. Also, we consider a “chromatic number” index coding delivery scheme applied to all requested packets. We refer to this scheme as RAndom Popularity-based (RAP) caching, with Chromatic-number Index Coding (CIC).

2) For the proposed RAP-CIC, we provide a new upper bound on the achievable rate by bounding the average chromatic number of the induced random conflict graph. By numerically optimizing this bound, we demonstrate the efficiency of our method and the gains over the method of [20] by simulation. However, a direct analysis of the proposed scheme appears to be elusive.

3) For the sake of analytical tractability, we further focus on a simpler caching placement scheme where the caching distribution is a step function (some files are cached with uniform probability, and others are not cached at all) and a polynomial-time approximation of CIC, referred to as greedy constrained coloring (GCC). We refer to this scheme as Random Least-Frequently-Used (RLFU) caching, given its analogy with the standard LFU caching policy,^6 with GCC delivery, or RLFU-GCC.

4) We provide an information theoretic lower bound on the (average) rate achieved by any caching scheme, and show the order-optimality of the proposed achievability schemes for the special case of a Zipf demand distribution. To the best of our knowledge, these are the first order-optimal results under this network model for nontrivial popularity distributions. In addition, our technique for proving the converse is not restricted to the Zipf distribution, such that it can be used to verify average rate order-optimality in other cases.

5) Our analysis identifies the regions in which conventional schemes (such as LFU with naive multicasting) can still preserve order-optimality, and exposes the wide range of opportunities for performance improvements via RAP or RLFU, combined with CIC or GCC. We show that, as in the D2D setting of [?], when the Zipf parameter is 0 < α < 1, the average rate with random demands and the min-max rate with arbitrary demands are order-equivalent. On the other hand, when α > 1, the average rate can exhibit order gains with respect to the min-max rate.

^6 LFU discards the least frequently requested file upon the arrival of a new file to a full cache of size M files. In the long run, this is equivalent to caching the M most popular files.


We remark that while we consider simultaneous requests, the argument made in [18], [19] to handle streaming sessions formed by multiple successive requests starting at different times holds here as well. Finally, it is interesting to note that, while RLFU-GCC becomes a special case of the general scheme described in [20], the optimization carried out in this paper and the corresponding performance analysis are new and non-trivial extensions. As pointed out in [20], one would think that an approach based on uniformly caching only the $\tilde{m} < m$ most popular files has the disadvantage that “... the difference in the popularities among the $\tilde{m}$ cached files is ignored. Since these files can have widely different popularities, it is wasteful to dedicate the same fraction of memory to each one of them. As a result, this approach does not perform well in general.” In contrast, we show that for the Zipf case, the approach is order-optimal provided that the cache threshold $\tilde{m}$ is carefully optimized as a function of the system parameters. In [20] it was also pointed out that “Another option is to dedicate a different amount of memory to each file in the placement phase. For example the amount of allocated memory could be proportional to the popularity of a file. While this option takes the different file popularities into account, it breaks the symmetry of the content placement. As a result, the delivery phase becomes intractable and the rate cannot be quantified ...” In contrast, using the proposed RAP caching optimization and CIC delivery across all requested packets, it is possible to find schemes that significantly outperform previous heuristics and, again for the Zipf case, are provably order-optimal for all regimes of the system parameters n, m, M, α.
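The step-function caching distribution behind RLFU can be illustrated with a minimal Python sketch: the cache is spread uniformly over the $\tilde{m}$ most popular files and nothing is assigned to the rest. The function name `rlfu_caching_distribution` and the explicit check $\tilde{m} \geq M$ (so that no component exceeds the bound 1/M imposed on caching distributions in Section III-A) are illustrative assumptions; the sketch does not attempt the optimization of $\tilde{m}$ carried out in the paper.

```python
def rlfu_caching_distribution(m, m_tilde, M):
    """Step-function caching distribution (illustrative helper, not from the paper):
    uniform over the m_tilde most popular files (indices 1..m_tilde),
    zero for the remaining m - m_tilde files."""
    if not 1 <= m_tilde <= m:
        raise ValueError("m_tilde must lie between 1 and m")
    if m_tilde < M:
        # Each component must not exceed 1/M, since a user cannot cache
        # more than one whole copy of a file.
        raise ValueError("m_tilde must be at least M")
    return [1.0 / m_tilde] * m_tilde + [0.0] * (m - m_tilde)


M = 2
p = rlfu_caching_distribution(m=10, m_tilde=4, M=M)
# Fraction of each file stored by every user: p_f * M.
fractions = [pf * M for pf in p]
```

With these hypothetical values, each user stores half of each of the four most popular files and nothing from the other six; choosing $\tilde{m} = M$ would recover whole-file caching of the M most popular files.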

The paper is organized as follows. In Section II we present the network model and the problem formulation. The random caching and coded multicasting scheme is introduced in Section III and a general converse result for the achievable average rate is given in Section IV. In Section V we prove and discuss the order-optimality of the proposed scheme for the Zipf request distribution. Further results, simulations and concluding remarks are presented in Sections VI and VII.









II. Network Model and Problem Formulation 


Consider a shared link network [18]-[20] with file library $\mathcal{F} = \{1, \dots, m\}$, where each file (i.e., message) has entropy equal to F bits, and user set $\mathcal{U} = \{1, \dots, n\}$, where each user has a cache (storage memory) of capacity MF bits. Without loss of generality, the files are represented by binary vectors $W_f \in \mathbb{F}_2^F$. The system setup is as follows:

1) At the beginning of time, a realization $\{W_f : f \in \mathcal{F}\}$ of the library is revealed to the encoder.

2) The encoder computes the cached content by using a set of $|\mathcal{U}|$ functions $\{Z_u : \mathbb{F}_2^{mF} \to \mathbb{F}_2^{MF} : u \in \mathcal{U}\}$, such that $Z_u(\{W_f : f \in \mathcal{F}\})$ denotes the codeword stored in the cache of user u. The operation of computing $\{Z_u : u \in \mathcal{U}\}$ and filling the caches does not cost any rate, i.e., it is done once and for all at the network setup, referred to as the caching phase.

3) After the caching phase, the network is repeatedly used. At each use of the network, a realization of the random request vector $\mathsf{f} = (\mathsf{f}_1, \dots, \mathsf{f}_n) \in \mathcal{F}^n$ is generated. We assume that $\mathsf{f}$ has i.i.d. components distributed according to a probability mass function $q = (q_1, \dots, q_m)$, referred to as the demand distribution. This is known a priori and, without loss of generality up to index reordering, has non-increasing components $q_1 \geq \dots \geq q_m$.

4) We let $\mathbf{f} = (f_1, \dots, f_n)$ denote the realization of the random request vector $\mathsf{f}$. This is revealed to the encoder, which computes a multicast codeword as a function of the library files and the request vector. In this work we consider a fixed-to-variable almost-lossless framework. Hence, the multicast encoder is defined by a fixed-to-variable encoding function $X : \mathbb{F}_2^{mF} \times \mathcal{F}^n \to \mathbb{F}_2^*$ (where $\mathbb{F}_2^*$ denotes the set of finite length binary sequences), such that $X(\{W_f : f \in \mathcal{F}\}, \mathbf{f})$ is the transmitted codeword. We denote by $L(\{W_f : f \in \mathcal{F}\}, \mathbf{f})$ the length function (in binary symbols) associated with the encoding function X.

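5) Each user receives $X(\{W_f : f \in \mathcal{F}\}, \mathbf{f})$ through the noiseless shared link, and decodes its requested file as $\hat{W}_{f_u} = \lambda_u(X, Z_u, \mathbf{f})$, where $\lambda_u : \mathbb{F}_2^* \times \mathbb{F}_2^{MF} \times \mathcal{F}^n \to \mathbb{F}_2^F$ denotes the decoding function of user u.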

6) The concatenation of 1) demand vector generation, 2) multicast encoding and transmission over 
the shared link and 3) decoding, is referred to as the delivery phase. 
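The demand generation step of the setup above can be sketched in a few lines of Python, assuming the Zipf demand distribution considered in this paper. The helper names `zipf_pmf` and `sample_requests` and the parameter values are illustrative assumptions, not part of the paper.

```python
import random


def zipf_pmf(m, alpha):
    """Zipf demand distribution: q_f proportional to f^(-alpha), f = 1..m."""
    weights = [f ** (-alpha) for f in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_requests(q, n, rng):
    """Draw one realization of the request vector: n i.i.d. file indices
    (1-based), each distributed according to the demand distribution q."""
    files = range(1, len(q) + 1)
    return rng.choices(files, weights=q, k=n)


rng = random.Random(0)            # fixed seed, for reproducibility only
q = zipf_pmf(m=20, alpha=0.6)     # hypothetical library size and Zipf parameter
requests = sample_requests(q, n=5, rng=rng)
```

Because the Zipf weights decrease with the file index, the resulting `q` already has the non-increasing ordering assumed for the demand distribution.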

Consistently with the existing information theoretic literature on caching networks (see Section I), we refer to a content distribution scheme, formed by both caching and delivery phases, directly as a caching scheme, and measure the system performance in terms of the rate during the delivery phase only. In particular, we define the rate of the scheme as

$$ R^{(F)} = \sup_{\{W_f : f \in \mathcal{F}\}} \frac{\mathbb{E}\left[ L(\{W_f : f \in \mathcal{F}\}, \mathsf{f}) \right]}{F}, \tag{2} $$

where the expectation is with respect to the random request vector.^7

This definition of rate has the following operational meaning. Assume that the download of a single file through the shared link takes one “unit of time”. Then, $R^{(F)}$ denotes the worst-case (over the library) average (over the demands) download time for the whole network, when the users place i.i.d. random requests according to the demand distribution q. The underlying assumption is that the content library (i.e., the realization of the files) changes very slowly in time, such that it is generated or refreshed at a time scale much slower than the time scale at which the users download the files. Hence, it is meaningful to focus only on the rate of the delivery phase, and disregard the cost of filling the caches (i.e., the cost of the caching phase), which is included in the code construction. Users make requests, and the network satisfies them by sending a variable length transmission until every user can successfully decode. After all users have decoded, a new round of requests is made. This forms a renewal process where the recurrent event is the event that all users have decoded their files. In the spirit of fixed-to-variable source coding, $R^{(F)}$ is the coding rate (normalized coding length) expressed in file “units of time”. Also, by the renewal theorem, it follows that $1/R^{(F)}$ yields (up to some fixed proportionality factor) the channel throughput in terms of per-user decoded bits per unit time. Finally, since the content library changes very slowly, averaging also over the realization of the files has little operational meaning. Instead, we take the worst-case over the file library realization.
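As a concrete reading of this definition, the following Python sketch evaluates the average rate of one conventional baseline mentioned in the Introduction, LFU caching with naive multicasting. It assumes that every user caches the M most popular files in full (M an integer) and that the source sends each distinct requested, uncached file exactly once, so that the average rate is the expected number of distinct uncached files requested, $\sum_{f > M} (1 - (1 - q_f)^n)$. The function names and parameter values are illustrative assumptions; this is not the scheme analyzed in the paper.

```python
import random


def zipf_pmf(m, alpha):
    """Zipf demand distribution: q_f proportional to f^(-alpha), f = 1..m."""
    weights = [f ** (-alpha) for f in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def lfu_naive_multicast_rate(q, n, M):
    """Expected number of distinct uncached files requested by n users.
    File f > M is requested by at least one user w.p. 1 - (1 - q_f)^n."""
    return sum(1.0 - (1.0 - qf) ** n for qf in q[M:])


def simulate_rate(q, n, M, trials, seed=0):
    """Monte Carlo estimate of the same average over random demands."""
    rng = random.Random(seed)
    files = range(1, len(q) + 1)
    total = 0
    for _ in range(trials):
        requests = rng.choices(files, weights=q, k=n)
        # Count distinct requested files that are not among the M cached ones.
        total += len({f for f in requests if f > M})
    return total / trials


q = zipf_pmf(m=50, alpha=0.8)     # hypothetical parameters
exact = lfu_naive_multicast_rate(q, n=10, M=5)
approx = simulate_rate(q, n=10, M=5, trials=20000)
```

The closed form and the simulation average over the same i.i.d. requests, so the two numbers should agree up to Monte Carlo noise.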

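Consider a sequence of caching schemes defined by cache encoding functions $\{Z_u\}$, multicast coding function X, and decoding functions $\{\lambda_u\}$, for increasing file size F = 1, 2, 3, .... For each F, the worst-case (over the file library) probability of error of the corresponding caching scheme is defined as

$$ P_e^{(F)}(\{Z_u\}, X, \{\lambda_u\}) = \sup_{\{W_f : f \in \mathcal{F}\}} \mathbb{P}\left( \bigcup_{u \in \mathcal{U}} \left\{ \lambda_u(X, Z_u, \mathsf{f}) \neq W_{\mathsf{f}_u} \right\} \right). \tag{3} $$

A sequence of caching schemes is called admissible if $\lim_{F \to \infty} P_e^{(F)}(\{Z_u\}, X, \{\lambda_u\}) = 0$. Achievability for our system is defined as follows: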
for our system is defined as follows: 

Definition 1: A rate R(n, m, M, q) is achievable for the shared link network with n users, library size m, cache capacity M, and demand distribution q, if there exists a sequence of admissible caching schemes with rate $R^{(F)}$ such that

$$ \limsup_{F \to \infty} R^{(F)} \leq R(n, m, M, q). $$

◊

^7 Throughout this paper, we directly use “rate” to refer to the average rate defined by (2) and explicitly use “average (expected) rate” if needed for clarity.

We let $R^*(n, m, M, q)$ denote the infimum (over all caching schemes) of the achievable rates. The notion of “order-optimality” for our system is defined as follows:

Definition 2: Let n, M be functions of m, such that $\lim_{m \to \infty} n(m) = \infty$. A sequence of caching schemes for the shared link network with n users, library size m, cache capacity M, and demand distribution q, is order-optimal if its rate R(n, m, M, q) satisfies

$$ \limsup_{m \to \infty} \frac{R(n, m, M, q)}{R^*(n, m, M, q)} \leq \nu, \tag{4} $$

for some constant $1 \leq \nu < \infty$, independent of m, n, M. ◊

Notice that in the definition of order-optimality we first let F → ∞ (required by the definition of achievable rate) and then we let m → ∞. In this second limit, we let n and M be functions of m, indicating that the notion of “order”, throughout this paper, is with respect to the library size m. Depending on how n and/or M vary with respect to m, we can identify different system operating regimes.^8

III. Random Fractional Caching and Linear Index Coding Delivery 

In this section we focus on a particular class of admissible schemes where the caching functions $\{Z_u\}$ are random and independent across the users [?], [?] and the multicast encoder is based on linear index coding [25], [?]. With random coding functions, two flavors of results are possible: 1) by considering the average rate with respect to the random coding ensemble, one can prove the existence of deterministic sequences of caching schemes achieving rate not worse than average; 2) by considering the concentration of the rate conditioned on the random caching functions, we obtain a stronger result: namely, in the limit of large file size F, the (random) rate is smaller than a given threshold with high probability. This implies achievability of such rate threshold by the random scheme itself (not only in terms of a non-constructive existence argument based on random coding). Here, we prove achievability in this second (stronger) sense.

^8 The case of constant m while n → ∞ is also considered and treated separately in Section VI.
A. Random Fractional Caching Placement

The caching placement phase works as follows:

1) For some given integer B, each file $f \in \mathcal{F}$ is divided into packets of equal size F/B bits, denoted by $\{W_{f,b} : b = 1, \dots, B\}$.^9

2) Each user randomly selects and stores in its cache a collection of $p_f M B$ distinct packets from each file $f \in \mathcal{F}$, where $p = (p_1, \dots, p_m)$ is a vector with components $0 \leq p_f \leq 1/M$, such that $\sum_{f=1}^{m} p_f = 1$, referred to as the caching distribution.^10

It follows that, for each user u,

$$ Z_u = \left( W_{1, b^u_{1,1}}, \dots, W_{1, b^u_{1, p_1 M B}}, W_{2, b^u_{2,1}}, \dots, W_{2, b^u_{2, p_2 M B}}, \dots, W_{m, b^u_{m,1}}, \dots, W_{m, b^u_{m, p_m M B}} \right), \tag{5} $$

where $b^u_{f,i}$ is the index of the i-th packet of file f cached by user u, and where the tuples of indices $(b^u_{f,1}, \dots, b^u_{f, p_f M B})$ are chosen independently across the users $u \in \mathcal{U}$ and the files $f \in \mathcal{F}$, with uniform probability over all distinct subsets of size $p_f M B$ of the set of B packets of file f. The collection of the cached packet indices (over all users and all files) is a random vector, denoted in the following by $\mathsf{C}$. For later use, a given cache configuration, i.e., a realization of $\mathsf{C}$, will be denoted by C. Also, we shall denote by $C_{u,f}$ the vector of indices of the packets of file f cached by user u. Finally, for the sake of notational simplicity, we shall not distinguish between “vectors” (ordered lists of elements) and the corresponding “sets” (unordered lists of elements), such that we write $b \notin C_{u,f}$ (resp., $b \in C_{u,f}$) to indicate that the b-th packet of file f is not present (resp., present) in the cache of user u. Observe that a random fractional caching scheme is completely characterized by the caching distribution p, where $p_f$ denotes the fraction of file f cached by each user. In Section III-D we shall describe how to design the caching distribution as a function of the system parameters.
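The placement procedure described above can be sketched as follows in Python. This is a minimal illustration, assuming that $p_f M B$ is an integer for every file (the value is rounded only to absorb floating-point error); the function name `random_fractional_placement` and the dictionary-of-sets representation of the cache configuration are assumptions made for the example.

```python
import random


def random_fractional_placement(p, M, B, n_users, seed=0):
    """For each user and each file f, pick p_f * M * B distinct packet
    indices uniformly at random among the B packets of the file.
    Returns caches[u][f] = set of cached packet indices (files and
    packets are 1-based; users are 0-based list positions)."""
    rng = random.Random(seed)
    caches = []
    for _ in range(n_users):
        cache = {}
        for f, pf in enumerate(p, start=1):
            k = round(pf * M * B)  # assumed to be an integer number of packets
            # Sampling without replacement gives a uniform subset of size k,
            # drawn independently for every user and every file.
            cache[f] = set(rng.sample(range(1, B + 1), k))
        caches.append(cache)
    return caches


# Hypothetical parameters: 3 files, cache of M = 1 file, B = 3 packets per file.
caches = random_fractional_placement(p=[2 / 3, 1 / 3, 0.0], M=1, B=3, n_users=3)
```

Since the components of p sum to one, every user stores exactly MB packets in total, which matches the cache capacity of M files.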


B. Linear Index Coding Delivery 

Finding a delivery scheme for the caching problem in the shared link network is equivalent to finding an index code with side information given by the cache configuration C. It is clear that under the caching functions defined before, each user u requesting file $f_u$ needs to obtain all the packets $W_{f_u, b}$ with $b \notin C_{u, f_u}$. It follows that a demand vector $\mathbf{f}$, given the cache configuration C, can be translated into a packet-level demand vector Q, containing the packets needed by each user. Symmetrically with the notation introduced for the cache configuration, we denote by $\mathsf{Q}$ the corresponding random vector, and by $Q_{u,f}$ the packet-level demand restricted to user u and file f. In particular, if user u requests file $f_u$, then $Q_{u,f}$ is empty for all $f \neq f_u$ and it contains the complement set of $C_{u, f_u}$ for $f = f_u$.

^9 Since we eventually take the limit for F → ∞, for simplicity we neglect the fact that B may not divide F.

^10 Note that $p_f$ represents the fraction of the memory M allocated to file f. Hence, we let $p_f$ be a function of n, m, M, q, but not a function of B.

Following the vast literature on index coding (see for example [25], [?], [?]), we define the side-information graph $\mathcal{S}_{C,Q}$ corresponding to the index coding problem defined by (C, Q) as follows:

• Vertices of $\mathcal{S}_{C,Q}$: For each packet in Q (i.e., requested by some user), form all possible distinct labels of the form v = {packet identity, user requesting, users caching}, where “packet identity” is the pair (f, b) of file index and packet index, “user requesting” is the index of some user u such that $b \in Q_{u,f}$, and “users caching” is the set of all users u' such that $b \in C_{u',f}$. Then, a vertex in $\mathcal{S}_{C,Q}$ is associated to each of such distinct labels. For simplicity of notation, we do not distinguish between label and vertex, and refer to “label v” or “vertex v” interchangeably, depending on the context. Notice that while “packet identity” and “users caching” are fixed by the packet identity and by the cache realization, the second label component (“user requesting”) can take multiple values, since several users may request the same packet.

• Edges of $\mathcal{S}_{C,Q}$: For each vertex v, $\rho(v)$, $\mu(v)$ and $\eta(v)$ denote the three fields in its label (namely, “packet identity”, “user requesting”, and “users caching”). Any two vertices $v_1$ and $v_2$ are connected by an edge if at least one of the following two conditions is satisfied: 1) $\rho(v_1) = \rho(v_2)$, or 2) $\mu(v_1) \in \eta(v_2)$ and $\mu(v_2) \in \eta(v_1)$.

The complement graph $\mathcal{H}_{C,Q}$ of the side information graph $\mathcal{S}_{C,Q}$ is known as the conflict graph. In particular, $\mathcal{H}_{C,Q}$ has the same vertices as $\mathcal{S}_{C,Q}$ and any two vertices $v_1$ and $v_2$ in $\mathcal{H}_{C,Q}$ are connected by an edge if both of the following conditions are satisfied: 1) $\rho(v_1) \neq \rho(v_2)$, and 2) $\mu(v_1) \notin \eta(v_2)$ or $\mu(v_2) \notin \eta(v_1)$.

Example 1: We consider a network with n = 3 users denoted as $\mathcal{U} = \{1, 2, 3\}$ and m = 3 files
denoted as $\mathcal{F} = \{A, B, C\}$. We assume M = 1 and partition each file into B = 3 packets. For example,
$A = \{A_1, A_2, A_3\}$. Let $\mathbf{p} = \{2/3, 1/3, 0\}$, which means that two packets of A, one packet of B and none of
C will be stored in each user’s cache. We assume a caching realization C is given by: $C_{1,A} = \{A_1, A_2\}$,
$C_{1,B} = \{B_1\}$, $C_{1,C} = \emptyset$; $C_{2,A} = \{A_1, A_3\}$, $C_{2,B} = \{B_2\}$, $C_{2,C} = \emptyset$; $C_{3,A} = \{A_1, A_2\}$, $C_{3,B} = \{B_3\}$,
$C_{3,C} = \emptyset$. Suppose that user 1 requests A, user 2 requests B and user 3 requests C (f = {A, B, C}), such
that $Q = \{A_3, B_1, B_3, C_1, C_2, C_3\}$. The corresponding conflict graph $\mathcal{H}_{C,Q}$ is shown in Fig. 1.
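As an illustrative sketch, the conflict graph of Example 1 can be built directly from the vertex and edge rules stated above. The names `cache`, `demand`, `build_vertices`, and `in_conflict` are hypothetical helpers introduced only for this illustration; they do not appear in the original text.

```python
from itertools import combinations

# Cache realization of Example 1: cache[u][f] is the set of packet indices
# of file f stored by user u (hypothetical data layout for this sketch).
cache = {
    1: {"A": {1, 2}, "B": {1}, "C": set()},
    2: {"A": {1, 3}, "B": {2}, "C": set()},
    3: {"A": {1, 2}, "B": {3}, "C": set()},
}
demand = {1: "A", 2: "B", 3: "C"}  # user -> requested file
B = 3  # packets per file


def build_vertices(cache, demand, B):
    """Return labels (packet identity, user requesting, users caching)."""
    vertices = []
    for u, f in demand.items():
        for b in range(1, B + 1):
            if b not in cache[u][f]:  # requested by u but not cached by u
                caching = frozenset(w for w in cache if b in cache[w][f])
                vertices.append(((f, b), u, caching))
    return vertices


def in_conflict(v1, v2):
    """Conflict-graph edge rule: different packets, and at least one
    requester does not cache the packet of the other vertex."""
    (rho1, mu1, eta1), (rho2, mu2, eta2) = v1, v2
    return rho1 != rho2 and (mu1 not in eta2 or mu2 not in eta1)


vertices = build_vertices(cache, demand, B)
edges = [(a, b) for a, b in combinations(vertices, 2) if in_conflict(a, b)]
print(len(vertices), "vertices,", len(edges), "conflict edges")
```

For this realization, the only pair of distinct requested packets that is not in conflict is $A_3$ and $B_1$, since user 1 caches $B_1$ and user 2 caches $A_3$.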

Fig. 1. An illustration of the conflict graph, where n = 3, $\mathcal{U} = \{1, 2, 3\}$, m = 3, $\mathcal{F} = \{A, B, C\}$ and M = 1. Each file is
partitioned into 3 packets. The caching realization C and the packet-level demand vectors are given in Example 1. The color
of each vertex in this graph represents the vertex coloring obtained by Algorithm 1. In this case, this vertex coloring
is the minimum vertex coloring, and therefore it achieves the graph chromatic number.

A well-known general index coding scheme consists of coloring the vertices of the conflict graph $\mathcal{H}_{C,Q}$
and transmitting the concatenation of the packets obtained by EXOR-ing the packets corresponding to
vertices with the same color. For any vertex coloring of the conflict graph, vertices with the same color form
(by definition) an independent set. Hence, the corresponding packets can be EXOR-ed together and sent
over the shared link at the cost of the transmission of a single packet.
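The following minimal sketch illustrates why two packets in the same color class cost a single transmission, using the pair $A_3$, $B_1$ of Example 1. The byte strings and the helper `xor_bytes` are made up for illustration only.

```python
def xor_bytes(x, y):
    """Bitwise EXOR of two equal-length packets."""
    return bytes(a ^ b for a, b in zip(x, y))


# Hypothetical packet contents of equal length.
A3 = b"packet-A3"
B1 = b"packet-B1"

# One coded transmission serves both users.
coded = xor_bytes(A3, B1)

# User 1 requests A and caches B1, so it removes B1 from the coded packet.
recovered_A3 = xor_bytes(coded, B1)
# User 2 requests B and caches A3, so it removes A3 from the coded packet.
recovered_B1 = xor_bytes(coded, A3)
```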

Letting $\chi(\mathcal{H}_{C,Q})$ denote the chromatic number of $\mathcal{H}_{C,Q}$, the corresponding normalized code length is

\[ \frac{L(\{W_f : f \in \mathcal{F}\}, \mathbf{f})}{F} = \frac{\chi(\mathcal{H}_{C,Q})}{B}, \tag{6} \]

since each coded packet corresponds to F/B coded binary symbols, and we have a total of $\chi(\mathcal{H}_{C,Q})$
coded packets (i.e., colors). For ease of reference, we denote this coding scheme as Chromatic-number
Index Coding (CIC). In passing, we observe that, by design, the CIC scheme allows coding over the full
set of requested packets Q, unlike the scheme proposed in [20], where coding is allowed within packets
of specific file groups. Notice also that, by construction, CIC allows all users to decode their requested
packets. Therefore, any sequence of CIC schemes yields probability of error identically zero, for all file
lengths F, and not just vanishing probability of error in the limit. Hence, while in our problem definition
we have considered a fixed-to-variable length “almost-lossless” coding framework, this class of algebraic
coding schemes is fixed-to-variable length exactly lossless.


The graph coloring problem is NP-complete and hard to approximate in general [49]. However, it is
clear from the above presentation that any coloring scheme (possibly using a larger number of colors)
yields a lossless caching scheme with possibly larger coding length. In particular, exploiting the special
structure of the conflict graph originated by the caching problem, we present in the following an algorithm
referred to as Greedy Constrained Coloring (GCC), which has complexity polynomial in n and B and
achieves asymptotically smaller or equal rate with respect to the exponentially complex greedy coloring
algorithm proposed in [19] for the case of arbitrary demands. The proposed GCC is the composition of
two sub-schemes, referred to as GCC$_1$ and GCC$_2$, given in Algorithms 1 and 2, respectively. Eventually,
GCC chooses the coloring with the smallest number of colors between the outputs of GCC$_1$ and GCC$_2$
(i.e., the shortest codeword).


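Algorithm 1 GCC$_1$
 1: Initialize $\mathcal{V}$ = Vertex-set($\mathcal{H}_{C,Q}$).
 2: while $\mathcal{V} \neq \emptyset$ do
 3:   Pick any $v \in \mathcal{V}$, and let $\mathcal{I} = \{v\}$.
 4:   for all $v' \in \mathcal{V} \setminus \mathcal{I}$ do
 5:     if {There is no edge between $v'$ and $\mathcal{I}$} $\cap$ { $\{\mu(v'), \eta(v')\} = \{\mu(v), \eta(v)\}$ } then
 6:       $\mathcal{I} = \mathcal{I} \cup \{v'\}$.
 7:     end if
 8:   end for
 9:   Color all the vertices of the resulting set $\mathcal{I}$ by an unused color.
10:   Let $\mathcal{V} \leftarrow \mathcal{V} \setminus \mathcal{I}$.
11: end while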


Notice that $\{\mu(v), \eta(v)\}$ denotes the (unordered) set of the users either requesting or caching the packet
corresponding to vertex v. Notice also that each set $\mathcal{I}$ produced by Algorithm 1 is an independent set
containing vertices with the same set of users either requesting or caching the corresponding packets. In
fact, starting from a “root” node v among those not yet selected by the algorithm, the corresponding set
$\mathcal{I}$ is formed by all the independent vertices $v'$ such that $\{\mu(v'), \eta(v')\} = \{\mu(v), \eta(v)\}$.

It is also worthwhile to notice that GCC$_2$ is nothing else than “naive multicasting”, which we have
included here, for the sake of completeness, in a form symmetric to that of GCC$_1$. In fact, it produces
a set $\mathcal{I}$ (and a color) for each requested packet, which is then transmitted (uncoded) and simultaneously
received by all requesting users.

Algorithm 2 GCC$_2$
 1: Initialize $\mathcal{V}$ = Vertex-set($\mathcal{H}_{C,Q}$).
 2: while $\mathcal{V} \neq \emptyset$ do
 3:   Pick any $v \in \mathcal{V}$, and let $\mathcal{I} = \{v\}$.
 4:   for all $v' \in \mathcal{V} \setminus \mathcal{I}$ do
 5:     if $\rho(v') = \rho(v)$ then
 6:       $\mathcal{I} = \mathcal{I} \cup \{v'\}$.
 7:     end if
 8:   end for
 9:   Color all the vertices of the resulting set $\mathcal{I}$ by an unused color.
10:   Let $\mathcal{V} \leftarrow \mathcal{V} \setminus \mathcal{I}$.
11: end while

We can see that both the outer “while-loop” starting at line 2 and the inner “for-loop” starting at line
4 of Algorithm 1 iterate at most nB times each. The operation in line 5 of Algorithm 1 costs at
most complexity n. Therefore, the complexity of Algorithm 1 is $O(n^3 B^2)$ (polynomial in n and B). Also,
it is easy to see that this complexity dominates that of Algorithm 2. Therefore, the overall complexity of
GCC is $O(n^3 B^2)$.
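A minimal Python sketch of Algorithms 1 and 2 is given below, applied to the six vertices of Example 1. It is an illustration under stated assumptions, not the authors' implementation: vertex labels are encoded as tuples `(packet identity, user requesting, users caching)`, the “pick any v” step simply takes the first remaining vertex, and the function names are hypothetical.

```python
def in_conflict(v1, v2):
    """Edge rule of the conflict graph."""
    (rho1, mu1, eta1), (rho2, mu2, eta2) = v1, v2
    return rho1 != rho2 and (mu1 not in eta2 or mu2 not in eta1)


def users_involved(v):
    """Unordered set of users requesting or caching the packet of v."""
    _, mu, eta = v
    return frozenset(eta) | {mu}


def gcc1(vertices):
    """Greedy coloring restricted to vertices with the same user set."""
    remaining, color_classes = list(vertices), []
    while remaining:
        root = remaining[0]  # assumption: "pick any v" takes the first one
        group = [root]
        for cand in remaining[1:]:
            same_users = users_involved(cand) == users_involved(root)
            if same_users and not any(in_conflict(cand, w) for w in group):
                group.append(cand)
        color_classes.append(group)
        remaining = [v for v in remaining if v not in group]
    return color_classes


def gcc2(vertices):
    """Naive multicasting: one color per distinct requested packet."""
    remaining, color_classes = list(vertices), []
    while remaining:
        root = remaining[0]
        color_classes.append([v for v in remaining if v[0] == root[0]])
        remaining = [v for v in remaining if v[0] != root[0]]
    return color_classes


# Vertices of Example 1: (packet identity, user requesting, users caching).
vertices = [
    (("A", 3), 1, frozenset({2})),
    (("B", 1), 2, frozenset({1})),
    (("B", 3), 2, frozenset({3})),
    (("C", 1), 3, frozenset()),
    (("C", 2), 3, frozenset()),
    (("C", 3), 3, frozenset()),
]
coloring = min(gcc1(vertices), gcc2(vertices), key=len)  # GCC keeps the shorter
print(len(coloring), "colors, normalized length", len(coloring) / 3)
```

On this instance GCC$_1$ groups $A_3$ with $B_1$ and uses 5 colors, while GCC$_2$ uses 6 colors (one per requested packet).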


C. Achievable Rate 


As anticipated before, we shall consider the concentration of the (random) rate of the scheme described
above, where the randomness follows from the fact that the caching functions, and therefore the conflict
graph, are random. It is clear from the delivery phase construction that the output length of CIC or GCC
does not depend on $\{W_f : f \in \mathcal{F}\}$ but only on C and Q (see (6)), since the conflict graph is determined
by the realization of the caches and of the demands. Therefore, without loss of generality, we can treat
the files as fixed arbitrary binary vectors and disregard the sup over $\{W_f : f \in \mathcal{F}\}$ in the rate definition.
Given n, m, M, the demand distribution q and the caching distribution p, we define

\[ R^{\mathrm{CIC}}(n, m, M, \mathbf{q}, \mathbf{p}) \triangleq \mathbb{E}\left[ \frac{\chi(\mathcal{H}_{C,Q})}{B} \,\middle|\, C \right] \tag{7} \]

to be the conditional rate achieved by CIC. Similarly, we let $R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \mathbf{p})$ denote the conditional
rate achieved by GCC, defined by (7) after replacing the chromatic number with the number of colors
produced by GCC. The same definition applies to $R^{\mathrm{GCC}_1}(n, m, M, \mathbf{q}, \mathbf{p})$ and $R^{\mathrm{GCC}_2}(n, m, M, \mathbf{q}, \mathbf{p})$. In
the following, it is intended that we consider the limit of the CIC and GCC schemes for $F, B \to \infty$ with
fixed packet size $F/B \to$ constant. The performance of the proposed caching schemes is given by the
following result.

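Theorem 1: For the shared link network with n users, library size m, cache capacity M, and demand
distribution q, fix a caching distribution p. Then, for all $\epsilon > 0$,

\[ \lim_{F \to \infty} P\left( R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \mathbf{p}) \le \min\{\psi(\mathbf{q}, \mathbf{p}), \bar{m}\} + \epsilon \right) = 1, \tag{8} \]

where

\[ \bar{m} \triangleq \sum_{f=1}^{m} \left( 1 - (1 - q_f)^n \right), \tag{9} \]

and where

\[ \psi(\mathbf{q}, \mathbf{p}) \triangleq \sum_{\ell=1}^{n} \binom{n}{\ell} \sum_{f=1}^{m} \rho_{f,\ell} \, (p_f M)^{\ell-1} (1 - p_f M)^{n-\ell+1}, \tag{10} \]

with

\[ \rho_{f,\ell} \triangleq P\left( f = \arg\max_{j \in \mathcal{D}} \, (p_j M)^{\ell-1} (1 - p_j M)^{n-\ell+1} \right), \tag{11} \]

where $\mathcal{D}$ is a random set of $\ell$ elements selected in an i.i.d. manner from $\mathcal{F}$ (with replacement).

Proof: See Appendix ■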

Remarks:

1) Since by construction $R^{\mathrm{CIC}}(n, m, M, \mathbf{q}, \mathbf{p}) \le R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \mathbf{p})$ for any realization of C, Q,
then also $R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \mathbf{p})$ stochastically dominates $R^{\mathrm{CIC}}(n, m, M, \mathbf{q}, \mathbf{p})$. Therefore, Theorem
1 immediately implies $\lim_{F \to \infty} P\left( R^{\mathrm{CIC}}(n, m, M, \mathbf{q}, \mathbf{p}) \le \min\{\psi(\mathbf{q}, \mathbf{p}), \bar{m}\} + \epsilon \right) = 1$.

2) As mentioned earlier, both $R^{\mathrm{CIC}}(n, m, M, \mathbf{q}, \mathbf{p})$ and $R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \mathbf{p})$ are functions of the
random caching placement C. Hence, not only $\mathbb{E}[R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \mathbf{p})] \le \min\{\psi(\mathbf{q}, \mathbf{p}), \bar{m}\}$, but
also the (conditional) rate $R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \mathbf{p})$ concentrates its probability mass all to the left of
the bound $\min\{\psi(\mathbf{q}, \mathbf{p}), \bar{m}\}$, in the limit of $F, B \to \infty$ and $F/B \to$ constant. This means that in
the large file limit, choosing a configuration of the caches that “misbehaves”, i.e., that yields a rate
larger than the bound of Theorem 1, is an event of vanishing probability.

3) The achievable rate in Theorem 1 is given by the minimum between two terms. The first term,
$\psi(\mathbf{q}, \mathbf{p})$, follows from the analysis of GCC$_1$ given in Appendix A. In particular, we show that
$\lim_{F \to \infty} P\left( |R^{\mathrm{GCC}_1}(n, m, M, \mathbf{q}, \mathbf{p}) - \psi(\mathbf{q}, \mathbf{p})| \le \epsilon \right) = 1$, which means that $R^{\mathrm{GCC}_1}(n, m, M, \mathbf{q}, \mathbf{p})$
concentrates its probability mass at $\psi(\mathbf{q}, \mathbf{p})$, as $F, B \to \infty$ and $F/B \to$ constant. The second
term, $\bar{m}$, is simply the average number of distinct requested files, which is a natural upper bound
on the average number (normalized by B) of distinct requested (uncached) packets. As
will be shown later, after careful design of the caching distribution p, the only case in which
$\bar{m} < \psi(\mathbf{q}, \mathbf{p})$ is in regimes of very small M, in which caching is shown to provide no order gains
with respect to non-caching approaches such as conventional unicasting or naive multicasting of all
requested files. Moreover, in this regime, $\bar{m}$ becomes a tight upper bound on this quantity. Accordingly,
we disregard $\epsilon$ and the fact that (8) involves a limit for $F \to \infty$ and identify the rate achieved by
GCC directly as $R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \mathbf{p}) = \min\{\psi(\mathbf{q}, \mathbf{p}), \bar{m}\}$.

4) The events underlying the probabilities $\rho_{f,\ell}$, defined in the statement of Theorem 1, can be illustrated
as follows. Let $\mathcal{D}$ be a random vector obtained by selecting in an i.i.d. fashion $\ell$ elements from
$\mathcal{F}$ with probability q. Notice that $\mathcal{D}$ may contain repeated entries. By construction, $P(\mathcal{D} =
(f_1, \ldots, f_\ell)) = \prod_{i=1}^{\ell} q_{f_i}$. Then, $\rho_{f,\ell}$ is the probability that the element in $\mathcal{D}$ which maximizes
the quantity $(p_j M)^{\ell-1} (1 - p_j M)^{n-\ell+1}$ is f.

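5) For the sake of the numerical evaluation of $\psi(\mathbf{q}, \mathbf{p})$, it is worthwhile to note that the probabilities
$\rho_{f,\ell}$ can be easily computed as follows. Let $J, J_1, \ldots, J_\ell$ denote $\ell + 1$ i.i.d. random variables distributed
over $\mathcal{F}$ with the same pmf q, and define (for simplicity of notation) $g_\ell(j) \triangleq (p_j M)^{\ell-1} (1 - p_j M)^{n-\ell+1}$.
Since $g_\ell(J_1), \ldots, g_\ell(J_\ell)$ are i.i.d., the CDF of $Y_\ell \triangleq \max\{g_\ell(J_1), \ldots, g_\ell(J_\ell)\}$ is given by

\[ P(Y_\ell \le y) = \left( P(g_\ell(J) \le y) \right)^{\ell} = \left( \sum_{j \in \mathcal{F} :\, g_\ell(j) \le y} q_j \right)^{\ell}. \tag{12} \]

Hence, it follows that

\[ \rho_{f,\ell} = P(Y_\ell = g_\ell(f)) = \left( \sum_{j \in \mathcal{F} :\, g_\ell(j) \le g_\ell(f)} q_j \right)^{\ell} - \left( \sum_{j \in \mathcal{F} :\, g_\ell(j) < g_\ell(f)} q_j \right)^{\ell}, \tag{13} \]

which can be easily computed by sorting the values $\{g_\ell(j) : j \in \mathcal{F}\}$.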
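The evaluation described in Remark 5 can be sketched numerically as follows. This is a hedged illustration: the function names are hypothetical, and, as an assumption of this sketch, tied values of $g_\ell(\cdot)$ are grouped by distinct value so that the inner sum of (10) is evaluated as $\mathbb{E}[Y_\ell]$ using the CDF in (12), which avoids counting tied files more than once.

```python
from math import comb


def avg_distinct_requests(q, n):
    """m-bar in (9): average number of distinct requested files."""
    return sum(1 - (1 - qf) ** n for qf in q)


def expected_max(q, g, ell):
    """E[Y_ell], with Y_ell the max of g over ell i.i.d. draws from q,
    computed from the CDF in (12). Ties are grouped by distinct value
    (an assumption of this sketch)."""
    total = 0.0
    for y in sorted(set(g)):
        le = sum(qj for qj, gj in zip(q, g) if gj <= y)
        lt = sum(qj for qj, gj in zip(q, g) if gj < y)
        total += y * (le ** ell - lt ** ell)
    return total


def psi(q, p, M, n):
    """psi(q, p) in (10), written as sum over ell of C(n, ell) * E[Y_ell]."""
    total = 0.0
    for ell in range(1, n + 1):
        g = [(pf * M) ** (ell - 1) * (1 - pf * M) ** (n - ell + 1) for pf in p]
        total += comb(n, ell) * expected_max(q, g, ell)
    return total


q = [0.7, 0.21, 0.09]
for p in ([1, 0, 0], [0.5, 0.5, 0], [1 / 3, 1 / 3, 1 / 3]):
    bound = min(psi(q, p, M=1, n=3), avg_distinct_requests(q, 3))
    print(p, round(bound, 4))
```

For the uniform caching distribution the sketch reduces to the closed form $(m/M - 1)(1 - (1 - M/m)^n)$, which provides a simple consistency check.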


D. Random Caching Optimization 

Driven by Theorem 1, we propose to use as caching distribution the one that minimizes the rate
$R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \mathbf{p})$, i.e.,

\[ \mathbf{p}^* = \arg\min_{\mathbf{p} :\, p_f \le 1/M,\ \sum_f p_f = 1} \ \min\{\psi(\mathbf{q}, \mathbf{p}), \bar{m}\}, \tag{14} \]

where $\psi(\mathbf{q}, \mathbf{p})$ is given by (10) with $\rho_{f,\ell}$ in (13) and $\bar{m}$ is given by (9). In the following, we refer to
random caching according to the distribution $\mathbf{p}^*$ as RAndom Popularity-based (RAP) caching placement.
Consequently, the caching schemes with RAP placement and CIC or GCC delivery will be referred to
as RAP-CIC and RAP-GCC, respectively.

The distribution $\mathbf{p}^*$ resulting from (14) may not have an analytically tractable expression in general.
This makes a direct analysis of the performance of RAP-CIC and RAP-GCC difficult, if not impossible.
For this reason, in the following we also consider a simplified caching placement according to the truncated
uniform distribution $\tilde{\mathbf{p}}$ defined by:

\[ \tilde{p}_f = \frac{1}{\tilde{m}}, \quad f \le \tilde{m}; \qquad \tilde{p}_f = 0, \quad f \ge \tilde{m} + 1, \tag{15} \]

where the cut-off index $\tilde{m} \ge M$ is a function of the system parameters.

The form of $\tilde{\mathbf{p}}$ in (15) is intuitive: each user caches the same fraction of (randomly selected) packets
from each of the $\tilde{m}$ most popular files and does not cache any packet from the remaining $m - \tilde{m}$
least popular files. If $\tilde{m} = M$, this caching placement coincides with the least frequently used (LFU)
caching policy [50]. For this reason, we refer to this caching placement as Random LFU (RLFU), and
the corresponding caching schemes as RLFU-CIC and RLFU-GCC. For later analysis purposes, we shall
use a simplified upper bound on the rate of RLFU-GCC given by the following corollary of Theorem 1.

Lemma 1: For any $\epsilon > 0$, the rate achieved by RLFU-GCC satisfies

\[ \lim_{F \to \infty} P\left( R^{\mathrm{GCC}}(n, m, M, \mathbf{q}, \tilde{\mathbf{p}}) \le \min\{\tilde{\psi}(\mathbf{q}, \tilde{m}), \bar{m}\} + \epsilon \right) = 1, \tag{16} \]

where

\[ \tilde{\psi}(\mathbf{q}, \tilde{m}) \triangleq \left( \frac{\tilde{m}}{M} - 1 \right) \left( 1 - \left( 1 - \frac{M}{\tilde{m}} \right)^{n G_{\tilde{m}}} \right) + n (1 - G_{\tilde{m}}), \tag{17} \]

with $G_{\tilde{m}} \triangleq \sum_{f=1}^{\tilde{m}} q_f$, and where $\bar{m}$ is defined in (9).

Proof: See Appendix ■

For convenience, we disregard $\epsilon$ and the fact that (16) involves a limit for $F \to \infty$ and refer directly
to the achievable rate upper bound as

\[ R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m}) \triangleq \min\{\tilde{\psi}(\mathbf{q}, \tilde{m}), \bar{m}\}, \tag{18} \]

where it is understood that the upper bound holds with high probability, as $F \to \infty$.
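The RLFU-GCC bound of Lemma 1 is straightforward to evaluate numerically. The sketch below assumes files are indexed in decreasing order of popularity, that M is a positive integer with $M \le \tilde{m}$, and uses hypothetical function names; it evaluates $\tilde{\psi}$, $\bar{m}$ and their minimum for the toy parameters m = 3, M = 1, n = 3.

```python
def psi_tilde(q, M, n, m_tilde):
    """psi-tilde(q, m_tilde) of Lemma 1 for the truncated uniform placement.
    Assumes q is sorted by decreasing popularity and 1 <= M <= m_tilde."""
    G = sum(q[:m_tilde])  # probability mass of the m_tilde most popular files
    ratio = m_tilde / M
    coded_part = (ratio - 1) * (1 - (1 - 1 / ratio) ** (n * G))
    uncached_part = n * (1 - G)  # requests for files that are never cached
    return coded_part + uncached_part


def rate_upper_bound(q, M, n, m_tilde):
    """Minimum of psi-tilde and the average number of distinct requests."""
    m_bar = sum(1 - (1 - qf) ** n for qf in q)
    return min(psi_tilde(q, M, n, m_tilde), m_bar)


q = [0.7, 0.21, 0.09]
for m_tilde in (1, 2, 3):
    print(m_tilde, round(rate_upper_bound(q, M=1, n=3, m_tilde=m_tilde), 4))
```

With $\tilde{m} = M$ the first term vanishes and the bound reduces to $n(1 - G_M)$, i.e., the expected number of requests for uncached files.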

While RLFU-GCC is generally inferior to RAP-GCC, we shall show in Section V that RLFU-GCC
is sufficient to achieve order-optimal rate when q is a Zipf distribution. In order to further shed light on
the relative merits of the various approaches, in Section VI we shall compare them in terms of actual
rates (not just scaling laws), obtained by simulation.

IV. Rate Lower Bound

In order to analyze the order-optimality of RLFU-GCC, we shall compare $R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m})$ with a
lower bound on the optimal achievable rate $R^*(n, m, M, \mathbf{q})$. This is given by:

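Theorem 2: The rate $R(n, m, M, \mathbf{q})$ of any admissible scheme for the shared link network with n users,
library size m, cache capacity M, and demand distribution q must satisfy

\[ R(n, m, M, \mathbf{q}) \ge R^{\mathrm{lb}}(n, m, M, \mathbf{q}) \triangleq \max_{\ell, r, \tilde{z}} \Big\{ P_1(\ell, r) P_2(\ell, r, \tilde{z}) \max_{z \in \{1, \ldots, \lceil \min\{\tilde{z}, r\} \rceil\}} z \left( 1 - M / \lfloor \ell / z \rfloor \right) 1\{\tilde{z}, r \ge 1\}, \ P_1(\ell, r) P_2(\ell, 1, \tilde{z}) \left( 1 - M/\ell \right) 1\{\tilde{z} \in (0, 1)\} \Big\}, \tag{19} \]

where $\ell \in \{1, \ldots, m\}$, $r \in \mathbb{R}_+$ with $r \le n \ell q_\ell$, and $\tilde{z} \in \mathbb{R}_+$ with $\tilde{z} \le \min\{r, \ell (1 - (1 - 1/\ell)^r)\}$, and
where

\[ P_1(\ell, r) \triangleq 1 - \exp\left( - \frac{(n \ell q_\ell - r)^2}{2 n \ell q_\ell} \right) \tag{20} \]

and

\[ P_2(\ell, r, \tilde{z}) \triangleq 1 - \exp\left( - \frac{\left( \ell (1 - (1 - 1/\ell)^r) - \tilde{z} \right)^2}{2 \ell (1 - (1 - 1/\ell)^r)} \right). \tag{21} \]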

Proof: See Appendix 


V. Order-optimality 

The focus of this section is to prove the order-optimality of the RLFU-GCC scheme introduced
in Section III-D when q is a Zipf distribution. Using Lemma 1 and Theorem 2, we shall
consider the ratio $R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m}) / R^{\mathrm{lb}}(n, m, M, \mathbf{q})$ for $n, m \to \infty$, where $m \to \infty$ and n, M are
functions of m as in Definition 2. According to this definition, RLFU-GCC is order-optimal if the ratio
$R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m}) / R^{\mathrm{lb}}(n, m, M, \mathbf{q})$ is uniformly bounded for all sufficiently large m. Obviously, order-
optimality of RLFU-GCC implies order-optimality of all “better” schemes, employing the optimized RAP
distribution and/or CIC coded delivery.

We shall also compare the (order-optimal) rate achieved by RLFU-GCC with the rate achieved by
other possibly suboptimal schemes, such as conventional LFU caching with naive multicasting^12 and the
scheme designed for arbitrary demands, achieving the order-optimal min-max rate [18], [19]. We shall
say that a scheme A has an order gain with respect to another scheme B if the rate achieved by A is
$o(\cdot)$ of the rate achieved by B. We shall say that a scheme A has a constant gain with respect to another
scheme B if the rate of A is $\Theta(\cdot)$ of the rate of B, and their ratio converges to some $\kappa < 1$ as $m \to \infty$.
In addition, we shall say that some scheme A exhibits a multiplicative caching gain if its rate is inversely
proportional to an increasing function of M. Specifically, we say that the multiplicative caching gain is
sub-linear, linear, or super-linear if such function is sub-linear, linear, or super-linear in M, respectively.

^11 When M is not explicitly given in terms of m, it means that the corresponding scaling law holds for M equal to any
arbitrary function of m, including M constant, as a particular case.

^12 Recall that with conventional LFU every user caches the M most popular files and hence there are no coded multicast
opportunities. In fact, it is straightforward to show that if we fix the placement scheme to (conventional) LFU in the shared
link network, the best delivery scheme is naive multicasting.

We notice that the behavior of the Zipf distribution is fundamentally different in the two regions of
the Zipf parameter $0 \le \alpha < 1$ and $\alpha > 1$.^13 In fact, for $\alpha < 1$, as $m \to \infty$, the probability mass is
“all in the tail”, i.e., the probability $\sum_{f=1}^{\tilde{m}} q_f$ of the $\tilde{m}$ most probable files vanishes, for any finite $\tilde{m}$. In
contrast, for $\alpha > 1$, the probability mass is “all in the head”, i.e., for sufficiently large (finite) $\tilde{m}$, the set
of the $\tilde{m}$ most probable files contains almost all the probability mass, irrespective of how large the library
size m is. In the following, we consider the two cases separately.
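The “tail” versus “head” behavior can be checked numerically with a short sketch. It assumes the usual Zipf normalization $q_f = f^{-\alpha} / \sum_{i=1}^{m} i^{-\alpha}$; the function name and the chosen parameter values are illustrative only.

```python
def zipf_head_mass(alpha, m, m_tilde):
    """Probability mass of the m_tilde most popular files out of m,
    assuming q_f proportional to f**(-alpha)."""
    weights = [f ** -alpha for f in range(1, m + 1)]
    return sum(weights[:m_tilde]) / sum(weights)


# Mass of the 100 most popular files as the library grows.
for alpha in (0.6, 1.6):
    for m in (10**3, 10**4, 10**5):
        print(alpha, m, round(zipf_head_mass(alpha, m, 100), 4))
```

For the smaller exponent the head mass keeps shrinking as m grows, while for the larger exponent it stays close to one regardless of m.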


A. Case $0 \le \alpha < 1$

In this case, we have:

Theorem 3: For the shared link network with n users, library size m, cache capacity M, and random
requests following a Zipf distribution q with parameter $0 \le \alpha < 1$, RLFU-GCC with $\tilde{m} = m$
yields order-optimal rate. The corresponding (order-optimal) achievable rate upper bound is given by
$R^{\mathrm{ub}}(n, m, M, \mathbf{q}, m) = \min\left\{ \left( \frac{m}{M} - 1 \right) \left( 1 - \left( 1 - \frac{M}{m} \right)^{n} \right), \bar{m} \right\}$.

Proof: See Appendix ■

RLFU with $\tilde{m} = m$ corresponds to caching packets at random, independently across users, with uniform
distribution across all files in the library. Not surprisingly, the order-optimal rate given by Theorem 3
is order-equivalent to the min-max rate under deterministic demands [19]. We will refer to RLFU with
$\tilde{m} = m$ also as uniform placement (UP) and to UP-GCC as the scheme with UP as caching placement and
GCC as delivery scheme. Intuitively, this result is due to the heavy tail property of the Zipf distribution
with $0 \le \alpha < 1$ such that, in this case, the random demands are approximately uniform over the whole
library and, from a lemma in Appendix C, we know that the average rate under uniform random demands
is order-equivalent to the min-max rate under arbitrary demands.

^13 The regime $\alpha = 1$ requires a not more difficult but somewhat different analysis because of the bounding of the Zipf distribution
(see Lemma 1 in [?]). For the sake of brevity, given the fact that the analysis is already quite heavy, and also motivated by the fact
that most experimental data on content demands show $\alpha \neq 1$ [12], in this paper we omit this case.

Nevertheless, making use of the knowledge of the Zipf parameter may yield fairly large constant rate
gains, especially for $\alpha$ close to 1. In particular, we can optimize the parameter $\tilde{m}$ as follows. Define
$H(\alpha, x, y) \triangleq \sum_{i=x}^{y} i^{-\alpha}$ and consider the bounds on the tail of the Zipf distribution given by the following
lemma, proved in [?]:

Lemma 2: If $\alpha \neq 1$, then

\[ \frac{1}{1-\alpha} (y+1)^{1-\alpha} - \frac{1}{1-\alpha} x^{1-\alpha} \le H(\alpha, x, y) \le \frac{1}{1-\alpha} y^{1-\alpha} - \frac{1}{1-\alpha} x^{1-\alpha} + \frac{1}{x^{\alpha}}. \tag{22} \]

□


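Notice that for the Zipf distribution with parameter $\alpha$, the term $G_{\tilde{m}}$ in Lemma 1 is written explicitly as
$G_{\tilde{m}} = \frac{H(\alpha, 1, \tilde{m})}{H(\alpha, 1, m)}$. Then, using Lemma 2 in Lemma 1, we can write

\[ R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m}) \le \tilde{\psi}(\mathbf{q}, \tilde{m}) = \left( \frac{\tilde{m}}{M} - 1 \right) \left( 1 - \left( 1 - \frac{M}{\tilde{m}} \right)^{n \frac{H(\alpha, 1, \tilde{m})}{H(\alpha, 1, m)}} \right) + n \left( 1 - \frac{H(\alpha, 1, \tilde{m})}{H(\alpha, 1, m)} \right) \overset{(a)}{\le} \left( \frac{\tilde{m}}{M} - 1 \right) + n \left( 1 - \left( \frac{\tilde{m}}{m} \right)^{1-\alpha} \right), \tag{23} \]

where (a) follows from the fact that

\[ \left( \frac{\tilde{m}}{M} - 1 \right) \left( 1 - \left( 1 - \frac{M}{\tilde{m}} \right)^{n \frac{H(\alpha, 1, \tilde{m})}{H(\alpha, 1, m)}} \right) \le \left( \frac{\tilde{m}}{M} - 1 \right) \tag{24} \]

and

\[ \frac{H(\alpha, 1, \tilde{m})}{H(\alpha, 1, m)} \ge \frac{\frac{1}{1-\alpha} (\tilde{m}+1)^{1-\alpha} - \frac{1}{1-\alpha}}{\frac{1}{1-\alpha} m^{1-\alpha} - \frac{1}{1-\alpha} + 1} \ \xrightarrow{\tilde{m}, m \to \infty} \ \left( \frac{\tilde{m}}{m} \right)^{1-\alpha}. \tag{25} \]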


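Minimizing the upper bound given by (23) with respect to $\tilde{m}$, subject to $M \le \tilde{m} \le m$, and treating
$\tilde{m}$ as a continuous variable, we obtain

\[ \tilde{m} = \min\left\{ \max\left\{ \left( \frac{n (1-\alpha) M}{m} \right)^{1/\alpha} m, \ M \right\}, \ m \right\}. \tag{26} \]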


Fig. 2 shows the significant gains that can be achieved by using RLFU-GCC with optimized $\tilde{m}$ (as given
by (26)) compared to UP-GCC for a network with m = 50000, n = 50 and $\alpha = 0.9$. For example, given
a target rate of 20, UP-GCC requires a cache capacity $M \approx 2000$, whereas RLFU-GCC with optimized
$\tilde{m}$ requires only $M \approx 800$.

Fig. 2. Rate versus cache size for UP-GCC and RLFU-GCC with optimized $\tilde{m}$ (see (26)), for n = 50, m = 50000, and Zipf
parameter $\alpha = 0.9$.
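As a complement to the closed-form cut-off, the bound of Lemma 1 can also be minimized by exhaustive search over the integer cut-off index. The sketch below is an illustration, not the procedure used to produce the figure: it assumes the standard Zipf normalization, integer $M \ge 1$, and hypothetical function names, and it compares the best RLFU cut-off with uniform placement ($\tilde{m} = m$).

```python
from itertools import accumulate


def zipf(alpha, m):
    """Zipf pmf with q_f proportional to f**(-alpha), f = 1..m."""
    w = [f ** -alpha for f in range(1, m + 1)]
    s = sum(w)
    return [x / s for x in w]


def rlfu_bound(G, M, n, m_tilde, m_bar):
    """min{psi-tilde, m-bar} for cut-off m_tilde with head mass G."""
    x = M / m_tilde
    psi_t = (1 / x - 1) * (1 - (1 - x) ** (n * G)) + n * (1 - G)
    return min(psi_t, m_bar)


def best_cutoff(q, M, n):
    """Exhaustive search of the cut-off in {M, ..., m} minimizing the bound.
    Assumes M is an integer with 1 <= M <= len(q)."""
    m = len(q)
    m_bar = sum(1 - (1 - qf) ** n for qf in q)
    head = list(accumulate(q))  # head[k - 1] = mass of the k top files
    best = None
    for m_tilde in range(M, m + 1):
        bound = rlfu_bound(head[m_tilde - 1], M, n, m_tilde, m_bar)
        if best is None or bound < best[1]:
            best = (m_tilde, bound)
    return best


q = zipf(0.9, 50000)
m_tilde, bound = best_cutoff(q, M=800, n=50)
m_bar = sum(1 - (1 - qf) ** 50 for qf in q)
print("optimized cut-off:", m_tilde, "bound:", round(bound, 2))
print("uniform placement bound:", round(rlfu_bound(1.0, 800, 50, 50000, m_bar), 2))
```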


B. Case $\alpha > 1$

This case is more intricate and we need to consider different sub-cases depending on how the number
of users scales with the library size: namely, we distinguish the cases of $n = \omega(m^{\alpha})$, $n = \Theta(m^{\alpha})$, and
$n = o(m^{\alpha})$.

1) Regime of “very large” number of users: $n = \omega(m^{\alpha})$:

Theorem 4: For the shared link network with library size m, cache capacity M, random requests
following a Zipf distribution q with parameter $\alpha > 1$, if $m \to \infty$ and the number of users scales as
$n = \omega(m^{\alpha})$, UP-GCC (i.e., $\tilde{m} = m$) achieves order-optimal rate.

Proof: Theorem 4 can be proved by following the steps of the proof of Theorem [?] in Appendix [?].
This is omitted for brevity. ■

2) Regime of “large” number of users: $n = \Theta(m^{\alpha})$:

Theorem 5: For the shared link network with library size m, cache capacity M, random requests
following a Zipf distribution q with parameter $\alpha > 1$, if $m \to \infty$ and the number of users scales
as $n = \Theta(m^{\alpha})$, RLFU-GCC achieves order-optimal rate with the values of $\tilde{m}$ given in Table I for
different sub-cases of the system parameters. The corresponding (order-optimal) achievable rate upper
bound $R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m})$ is also provided in Table I.

Proof: See Appendix ■ 



M 

m 

m, M, q, m) 

P >1 

M > 0 

m 


0 < p < 1 

0 < M < 1 

1 

p^m 

2p“ m — 1 + 0 (m) 

1 < M < - 
P 

pR M^m 

2pem / m \ 

1 — 1 1 

Al 

m 

m / Tn\ 

m“^ + Km) 


TABLE I

Order-optimal choice of $\tilde{m}$ and the corresponding achievable rate upper bound $R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m})$ for
RLFU-GCC with $\alpha > 1$ and $n = \Theta(m^{\alpha})$.


Without loss of generality, we let the leading term of n in terms of m be $\rho m^{\alpha}$ for some $\rho > 0$, i.e.,
$n = \rho m^{\alpha} + o(m^{\alpha})$ for $m \to \infty$. Hence, we distinguish the following cases:

• For $\rho \ge 1$, the network behaves similarly to the case of $n = \omega(m^{\alpha})$, and UP-GCC achieves order-
optimal rate, which scales as $\Theta(m/M)$ for $M \ge 1$.

• For $0 < \rho < 1$, we distinguish three regimes of M, namely, $0 \le M < 1$, $1 \le M < 1/\rho$, and $M \ge 1/\rho$.
The corresponding order-optimal value of $\tilde{m}$ varies from $\rho^{1/\alpha} m$, via $\rho^{1/\alpha} M^{1/\alpha} m$, to m. This corresponds
to the order-optimal caching placement varying from RLFU to UP. Correspondingly, the scaling law
of the rate varies from $\Theta(m)$, via $\Theta\left( m / M^{1 - 1/\alpha} \right)$, to $\Theta(m/M)$, where the multiplicative caching gain
of RLFU-GCC (with order-optimal $\tilde{m}$) varies from sub-linear to linear.

3) Regime of “small-to-moderate” number of users: $n = o(m^{\alpha})$:

Theorem 6: For the shared link network with library size m, cache capacity M, random requests
following a Zipf distribution q with parameter $\alpha > 1$, if $m \to \infty$ and the number of users scales as
$n = o(m^{\alpha})$, RLFU-GCC achieves order-optimal rate with the values of $\tilde{m}$ given in Tables II and III for
different sub-cases of the system parameters. The corresponding (order-optimal) achievable rate upper
bound $R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m})$ is also provided in Tables II and III.
Proof: See Appendices and 


III 


M 

m 

i?“’^(n, m, M, q, m) 

0 < M < 1 

1 

2n“ -h 0 

7 . ^ 

1 < M < - 

n 

1 ^ 71 ^ / 1 \ 

M = o{m) 




1 < M = 0 1 n“-i j 

M = Q{m) = Kim 

m 

m / Tn\ 

1 

see Table III 

OJ j = M < - 

M = o{m) 

M 

\ o( ) 

M“-1 ' 

M = 0(m) = Kim 

M 

( n ^ \ /-ix 

M> - 

n 

m 

. ( m ^ f 'm\ l 


TABLE II

Order-optimal choice of $\tilde{m}$ and the corresponding achievable rate upper bound $R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m})$ for
RLFU-GCC with $\alpha > 1$ and $n = o(m^{\alpha})$. Here, $0 < \kappa_1 < 1$ indicates a fixed positive constant.


Since Theorem 6 contains several regimes, it is useful to discuss separately some noteworthy behaviors.
We start by considering the case of $n = o(m^{\alpha-1})$, for which there are two relevant regimes of M (see Table
II), namely, $0 \le M < 1$ and $1 \le M < m^{\alpha}/n$. In particular:

• If $0 \le M < 1$, the achievable rate upper bound is $2n^{1/\alpha}$. This rate scaling can also be achieved by
using naive multicasting for all the requested files from the file set $\{1, \ldots, n^{1/\alpha}\}$ and conventional
unicasting for the requested files from the remaining set $\{n^{1/\alpha} + 1, \ldots, m\}$. It is not difficult to
show (details are omitted) that the average number of distinct files requested from the set $\{n^{1/\alpha} +$
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M 


R'^°{n, m, M, q, m) 


1<M < 
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M = Q (n ‘ 


K < 1 


M = o{m] 
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2n^ , 
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K > 


rj{a - 1) 


M or 
Mini 
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TABLE III 

For the same regime of Table II. Here, $\kappa$, $\kappa_1$, $\eta$ indicate fixed
positive constants, and $\mu$ is an arbitrary positive constant larger than 2.


$1, \ldots, m\}$ is $n^{1/\alpha} + o(n^{1/\alpha})$. Hence, both the naive multicasting and the conventional unicasting of
the requested files from the respective sets require rate equal to $n^{1/\alpha}$ in the leading order, such
that the concatenation of the two delivery schemes requires rate $2n^{1/\alpha}$. In order to achieve this
(order-optimal) rate scaling, caching is not needed at all. We conclude that, in this regime of “small
storage capacity” ($M < 1$), caching does not achieve any significant gain over the simple non-caching
strategy described above, based on combining naive multicasting for the most popular files
and conventional unicasting of the remaining less popular files.

• For the case $1 \le M < m^{\alpha}/n$, we notice that the assumption $n = o(m^{\alpha-1})$ yields $m^{\alpha}/n = \omega(m)$. Hence,
the constraint $M < m^{\alpha}/n$ is dominated by the obvious condition $M \le m$, which always holds by
definition.^14 Considering that $n = o(m^{\alpha-1})$ implies that $m = \omega(n^{1/(\alpha-1)})$, we distinguish the following
three sub-cases: $1 \le M = o(n^{1/(\alpha-1)})$, $M = \Theta(n^{1/(\alpha-1)})$, and $\omega(n^{1/(\alpha-1)}) = M \le m$. We now discuss
in more detail the two regimes $1 \le M = o(n^{1/(\alpha-1)})$ and $\omega(n^{1/(\alpha-1)}) = M \le m$ shown in Table II,
while we refer the reader to Table III for the case $M = \Theta(n^{1/(\alpha-1)})$.

In the case $1 \le M = o(n^{1/(\alpha-1)})$ or, equivalently, $n = \omega(M^{\alpha-1})$, if $M = o(m)$, then the order-optimal
RLFU parameter is $\tilde{m} = M^{1/\alpha} n^{1/\alpha}$. In this case, the rate is $\Theta\left( n^{1/\alpha} / M^{1 - 1/\alpha} \right)$, which exhibits order
gain with respect to the rate obtained with UP, given by $\Theta(\min\{m/M, n\})$. This also shows that the
order-optimal average rate in this regime yields an order gain with respect to the min-max
order-optimal rate [18], [19].

We interpret this order gain as the benefit due to caching according to popularity. Intuitively, when
$\alpha > 1$ and the number of users is not very large ($n = o(m^{\alpha-1})$), only a limited number of files are
requested with non-vanishing probability. Meanwhile, $1 \le M = o(n^{1/(\alpha-1)})$ and $n = o(m^{\alpha-1})$ imply
that the cache capacity is $M = o(m)$, i.e., only a sublinear number of files can be cached. Hence,
it is critically important to be able to focus on the files that deserve to be cached. We conclude
that, in this case, caching according to the knowledge of the demand distribution makes a significant
difference (in fact, a difference in the rate scaling order) with respect to UP.

^14 If $M \ge m$, then each user can cache the whole library and the rate is trivially zero.

In the other regime, $\omega(n^{1/(\alpha-1)}) = M \le m$, the cache size M can be large. In this case, LFU
(obtained by letting $\tilde{m} = M$) combined with the naive multicasting of the uncached requested files
achieves order-optimal rate, which scales as $\Theta\left( n / M^{\alpha-1} \right)$ (see Table II). Again, this rate exhibits an
order gain with respect to the min-max order-optimal rate.

Intuitively, this is due to the fact that, in this case, users request relatively few files, most of which are
the popular ones. Since the storage capacity is large, LFU caching covers most of the requests
and the source node only needs to serve the unpopular requests, which account for a vanishing rate
($\Theta\left( n / M^{\alpha-1} \right) = o(1)$). In addition, we observe that the multiplicative caching gain becomes super-
linear for $\alpha > 2$.

Then, we examine the case of $o(m^{\alpha}) = n = \omega(m^{\alpha-1})$,^15 where the number of users is relatively large.
The relevant regimes of M (see Table II) in this case are $0 \le M < 1$, $1 \le M < m^{\alpha}/n$, and $m^{\alpha}/n \le M \le m$.
Keeping the order of n in m fixed and increasing the order of M in m, the order-optimal $\tilde{m}$ varies from
$n^{1/\alpha}$, via $M^{1/\alpha} n^{1/\alpha}$, to m, indicating that the caching placement converges to UP instead of LFU, in contrast
to the case of $n = o(m^{\alpha-1})$ considered before. This shows that in this regime, with the exception of the
“small storage capacity” regime $M < 1$, LFU with naive multicasting fails to achieve order-optimality.
In addition, as the order of M in m increases, the scaling law of the rate varies from $\Theta(n^{1/\alpha})$, via
$\Theta\left( n^{1/\alpha} / M^{1 - 1/\alpha} \right)$, to $\Theta(m/M)$. This indicates that the multiplicative caching gain goes from sub-linear to linear.

^15 We do not discuss the case of $n = \Theta(m^{\alpha-1})$ for the sake of brevity and ease of presentation. The corresponding result
can be found in Table II and [?].


C. Remark 

We finally remark that in this paper, for the sake of presentation clarity, we let n be a function of m
such that $n \to \infty$ as $m \to \infty$. However, when m is a constant independent of n, and $n \to \infty$, the ratio
between $R^{\mathrm{ub}}(n, m, M, \mathbf{q}, \tilde{m})$ and $R^{\mathrm{lb}}(n, m, M, \mathbf{q})$ is also upper bounded by a constant, which is shown
by the following corollary.

Corollary 1: For the shared link network with n users, a library of constant size m, cache capacity
M, and random requests following a Zipf distribution q with parameter $\alpha \ge 0$, UP-GCC achieves
order-optimal rate with the following gap guarantee:

\[ \limsup_{n \to \infty} \frac{R^{\mathrm{ub}}(n, m, M, \mathbf{q}, m)}{R^{\mathrm{lb}}(n, m, M, \mathbf{q})} \le \frac{12}{1 - \epsilon}, \tag{27} \]

for some arbitrarily small $\epsilon > 0$, independent of m, n, M.

Proof: See Appendix H.


VI. Discussions and Simulation Results 

In Section V we have seen that, under a Zipf demand distribution, RLFU-GCC with $\tilde{m}$ given in Tables
I, II and III achieves order-optimal rate and so do RLFU-CIC, RAP-GCC and RAP-CIC. In all these
schemes, once the cache configuration is given, the delivery phase reduces to an index coding problem.
Despite the fact that for general index coding no graph coloring scheme is known to be sufficient to
guarantee order-optimality [32], for the specific problem at hand we have the pleasing result that CIC
and even the simpler GCC are sufficient for order optimality.

While this result is proved by considering the RLFU-GCC scheme, for the sake of analytical simplicity,
one would like to directly use RAP-GCC or, better, RAP-CIC, to achieve some further gain in terms of
actual rate, beyond the scaling law. While the minimization of the bound $\min\{\psi(\mathbf{q}, \mathbf{p}), \bar{m}\}$
of Theorem 1 with respect to $\mathbf{p}$ is a non-convex problem without an appealing structure, it is possible
to use brute-force search or branch-and-bound methods pl| to search for good choices of the caching
distribution $\mathbf{p}$. Fig. 3 shows $\mathbf{p}^*$ obtained by numerical minimization of the bound
$\min\{\psi(\mathbf{q}, \mathbf{p}), \bar{m}\}$ for a toy case with $m = 3$, $M = 1$, $n = 3, 5, 10, 15$ and demand
distribution $\mathbf{q} = [0.7, 0.21, 0.09]$. We observe how the caching distribution $\mathbf{p}^*$, which does not
necessarily coincide with $\mathbf{q}$, adjusts according to the system parameters to balance the local caching
and coded multicasting gains. In particular, $\mathbf{p}^*$ goes from caching the most popular files (as in LFU)
for $n = 3$ to UP for $n = 15$. Recall from the theorems of Section V that the optimized RLFU caching
distribution follows this same trend, going from LFU ($\tilde{m} = M$) to UP ($\tilde{m} = m$) as $n$ increases,
while constrained to be a step function. This, perhaps surprising, behavior arises from the fact that even
if the “local” demand distribution $\mathbf{q}$ is fixed, when the number of users increases, the “aggregate”
demand distribution, i.e., the probability that a file gets requested by at least one user, flattens. This
effectively uniformizes the “multicast weight” of each file, requiring caching distributions that flatten
accordingly.
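The flattening of the aggregate demand can be illustrated numerically. The following minimal sketch assumes i.i.d. requests, so that file $f$ is requested by at least one of $n$ users with probability $1 - (1 - q_f)^n$; the function name `aggregate_demand` is illustrative and not part of the paper.

```python
def aggregate_demand(q, n):
    """Probability that each file is requested by at least one of n i.i.d. users."""
    return [1.0 - (1.0 - qf) ** n for qf in q]


q = [0.7, 0.21, 0.09]  # toy demand distribution used in the text
for n in (3, 5, 10, 15):
    agg = aggregate_demand(q, n)
    # Ratio between the most and least "aggregately popular" file.
    spread = max(agg) / min(agg)
    print(n, [round(a, 3) for a in agg], round(spread, 2))
```

The ratio between the largest and smallest aggregate probability shrinks toward 1 as $n$ grows, which is the sense in which the aggregate demand becomes more uniform.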

The corresponding achievable rate $R^{\rm ub}(n, m, M, \mathbf{q}, \mathbf{p}^*)$, with $\mathbf{p}^*$ obtained by the numerical
minimization described above, is shown in Fig. 4, confirming the performance improvement provided by
RAP-GCC. For comparison, Fig. 4 also shows the rates achieved by RLFU-GCC, with $\tilde{m} = 1, 2,$ and $3$.



Fig. 3. The optimal caching distribution p* for a network, where m = 3, M = 1 and n = 3,5, 10,15 and the demand 
distribution is q = [0.7,0.21,0.09]. 

As discussed in connection with the theorems of Section V, the fact that the caching distribution adjusts to changes in all system
parameters, and not just the demand distribution q, is a key aspect of our order-optimal schemes and one 
of the reasons for which previously proposed schemes have failed to provide order-optimal guarantees. 

In addition, unlike uncoded delivery schemes that transmit each non-cached packet separately, or the
scheme suggested in [20], where files are grouped into subsets and coding is performed within each
subset, another key aspect of the order-optimal schemes presented in this paper is the fact that coding
is allowed within the entire set of requested packets. When treating different subsets of files separately,
missed coding opportunities can significantly degrade the efficiency of coded multicasting.

Fig. 4. $R^{\rm ub}(n, m, M, \mathbf{q}, \mathbf{p})$ for different caching distributions $\mathbf{p}$ and for a network with
$m = 3$, $M = 1$ and $n = 3, 5, 10, 15$ and demand distribution $\mathbf{q} = [0.7, 0.21, 0.09]$. Note that
$\mathbf{p} = [1/3, 1/3, 1/3]$, $\mathbf{p} = [0.5, 0.5, 0]$, and $\mathbf{p} = [1, 0, 0]$ correspond to RLFU with $\tilde{m} = 3$,
$\tilde{m} = 2$, and $\tilde{m} = 1$, respectively.

For example, in the setting of the figures above, with $M = 1.5$ and $n = 20$, by following the recipe given
in [20],⁶ each of the $m = 3$ files becomes a separate group, delivered independently of each other, yielding
an expected rate of 1.5, which can also be achieved by conventional LFU with naive multicasting. On
the other hand, for this same setting, RLFU-GCC uses a uniform caching distribution and GCC over all
requested packets, yielding a rate of 0.5.

In Figs. 5 and 6, we plot the rate achieved by RLFU-GCC, given by $R^{\rm ub}(n, m, M, \mathbf{q}, \tilde{m})$ with

$$\tilde{m} = \arg\min_{\tilde{m}} R^{\rm ub}(n, m, M, \mathbf{q}, \tilde{m}), \qquad (28)$$

which can be computed via a simple one-dimensional search. For comparison, Figs. 5 and 6 also show
the rate achieved by: 1) UP-GCC (i.e., letting $\tilde{m} = m$); 2) LFU with naive multicasting (LFU-NM),
given by $\sum_{f=M+1}^{m} \left(1 - (1 - q_f)^n\right)$; and 3) the grouping scheme analyzed in [20], which is referred to
as “reference scheme” (RS). The expected rate is shown as a function of the per-user cache capacity $M$
for $n = \{50, 500, 5000\}$, $m = \{500, 5000\}$, and $\alpha = \{0.6, 1.6\}$. The simulation results agree with the
scaling law analysis presented in Section V. In particular, we observe that, for all scenarios simulated in
Figs. 5 and 6, RLFU-GCC is able to significantly outperform both LFU-NM and RS. For example, when
$\alpha = 1.6$, $m = 500$ and $n = 5000$, Fig. 6(d) shows that for a cache size equal to just 4% of the library
($M = 20$), the proposed scheme achieves a factor improvement in expected rate of 5x with respect to
the reference scheme and 8x with respect to LFU-NM. Interestingly, we notice that the reference scheme
(RS) of [20] often yields a rate worse than UP-GCC, a scheme that does not exploit the knowledge of the
demand distribution.

Fig. 5. Simulation results for $\alpha = 0.6$. a) $m = 5000$, $n = 50$. b) $m = 5000$, $n = 500$. c) $m = 5000$, $n = 5000$.
d) $m = 500$, $n = 5000$. RLFU in this figure corresponds to the RLFU with optimized $\tilde{m}$ given by (28).

Fig. 6. Simulation results for $\alpha = 1.6$. a) $m = 5000$, $n = 50$. b) $m = 5000$, $n = 500$. c) $m = 5000$, $n = 5000$.
d) $m = 500$, $n = 5000$. RLFU in this figure corresponds to the RLFU with optimized $\tilde{m}$ given by (28).

⁶ The achievable rate for the scheme proposed in [20] is computed based on a grouping of the files, an
optimization of the memory assigned to each group, and a separate coded transmission scheme for each
group, as described in [20].
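The one-dimensional search in (28) can be sketched as follows. This is an illustration under stated assumptions, not the authors' simulation code: the rate expression is the RLFU bound $\min\{\psi(\mathbf{q}, \tilde{m}), \bar{m}\}$ with $\psi(\mathbf{q}, \tilde{m}) = (\tilde{m}/M - 1)\left(1 - (1 - M/\tilde{m})^{n G_{\tilde{m}}}\right) + n(1 - G_{\tilde{m}})$ as derived in Appendix B, $\bar{m} = \sum_f (1 - (1 - q_f)^n)$ is assumed to be the expected number of distinct requested files, and all function names are hypothetical.

```python
import math


def zipf_pmf(m, alpha):
    """Zipf demand distribution over files 1..m (most popular first)."""
    weights = [f ** (-alpha) for f in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def m_bar(q, n):
    """Expected number of distinct files requested by n i.i.d. users."""
    return sum(1.0 - (1.0 - qf) ** n for qf in q)


def psi_rlfu(q, n, M, m_tilde):
    """psi(q, m_tilde) for RLFU; assumes M <= m_tilde <= len(q)."""
    G = sum(q[:m_tilde])            # probability mass of the cached files
    x = M / m_tilde                 # per-packet caching probability
    return (1.0 / x - 1.0) * (1.0 - (1.0 - x) ** (n * G)) + n * (1.0 - G)


def r_ub(q, n, M, m_tilde):
    return min(psi_rlfu(q, n, M, m_tilde), m_bar(q, n))


def best_cutoff(q, n, M):
    """Exhaustive one-dimensional search for the cut-off index in (28)."""
    candidates = range(max(1, math.ceil(M)), len(q) + 1)
    return min(candidates, key=lambda mt: r_ub(q, n, M, mt))


def lfu_nm_rate(q, n, M):
    """LFU with naive multicasting; assumes an integer cache size M."""
    return sum(1.0 - (1.0 - qf) ** n for qf in q[int(M):])


q = zipf_pmf(m=100, alpha=0.6)
n, M = 50, 10
m_tilde = best_cutoff(q, n, M)
print(m_tilde, r_ub(q, n, M, m_tilde), r_ub(q, n, M, len(q)), lfu_nm_rate(q, n, M))
```

Setting `m_tilde = len(q)` recovers the UP bound, so the optimized cut-off can never do worse than UP under this bound.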

Computing the chromatic number of a general graph is an NP-hard problem and difficult to approximate
[49]. However, for specific graphs (e.g., Erdős-Rényi random graphs $G(n, p)$), the chromatic number can
be approximated or even computed [52]-[55]. In our case, by using the property of the conflict graph
resulting from RAP or RLFU, we have shown that the polynomial-time ($O(n^3 B^2)$) greedy constrained
coloring (GCC) algorithm can achieve the upper bound of the expected rate given in Theorem 1. Finally,
we remark that, as recently shown in [56], [57], especially when operating in finite-length regimes
($B$ finite), one can design improved greedy coloring algorithms that, with the same polynomial-time
complexity, further exploit the structure of the conflict graph and the optimized RAP caching distribution
to provide significant rate improvements. This is confirmed by the simulation shown in Fig. 7, where, in
addition to RLFU-GCC, UP-GCC, and LFU-NM, we also plot the rate achieved by RAP-HgC, where
HgC is the Hierarchical greedy Coloring algorithm proposed in [57], for a network with (a) $m = n = 5$,
(b) $m = n = 8$, $\alpha = 0.6$, and a finite number of packets per file $B = 500$.
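To make the role of coloring concrete, here is a generic greedy vertex-coloring sketch. It is neither the paper's GCC nor HgC: the vertex ordering (highest degree first) is a common heuristic chosen only for illustration, and the toy graph is hypothetical. Any proper coloring of the conflict graph partitions the requested packets into independent sets, so the number of colors used bounds the number of coded transmissions.

```python
def greedy_coloring(adj):
    """Greedy proper coloring; adj maps each vertex to the set of conflicting vertices."""
    color = {}
    # Visit high-degree vertices first (a heuristic assumed here, not the paper's rule).
    for v in sorted(adj, key=lambda u: -len(adj[u])):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:  # smallest color not used by an already-colored neighbor
            c += 1
        color[v] = c
    return color


# Toy conflict graph: a 5-cycle, whose chromatic number is 3.
adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = greedy_coloring(adj)
num_colors = len(set(coloring.values()))
print(coloring, num_colors)
```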




Fig. 7. Two examples of the simulated expected rate using RAP-HgC ($B = 500$). The comparison includes UP-GCC
($B \to \infty$), RLFU-GCC ($B \to \infty$ and $\tilde{m}$ given by (28)), and LFU-NM ($B \to \infty$). In these simulations,
$\alpha = 0.6$. a) $m = n = 5$. b) $m = n = 8$.


VII. Conclusions 

In this paper, we built on the shared link network with caching, coded delivery, and random demands,
first considered in [20]. We formally defined the problem in an information theoretic sense, giving an
operational meaning to the per-user rate averaged over the random demands. We analyzed achievability
schemes based on random fractional (packet level) caching and Chromatic-number Index Coding (CIC)
delivery, where the latter is defined on a properly constructed conflict graph that involves all the requested
packets. In particular, any suboptimal (e.g., greedy) technique for coloring such a conflict graph yields an
achievable rate. Our bound (Theorem 1) considers a particular delivery scheme that we refer to as Greedy
Constrained Coloring (GCC), which is polynomial in the system parameters. The direct optimization of
the bound with respect to the caching distribution yields a caching placement scheme that we refer to as
RAndom Popularity-based (RAP). For analytical convenience, we also considered a simpler choice of the
caching distribution, where caching is performed with uniform probability up to an optimized file index
cut-off value $\tilde{m}$, and no packets of files with index larger than $\tilde{m}$ are cached. This placement scheme is
referred to as Random Least Frequently Used (RLFU), for the obvious resemblance with conventional
LFU caching. We also provided a general rate lower bound (Theorem 2).

Then, by analyzing the achievable rate of RLFU-GCC and comparing it with the general rate lower
bound, we could establish the order-optimality of the proposed schemes in the case of Zipf demand
distribution, where order-optimality indicates that the ratio between the achievable rate and the best
possible rate is upper bounded by a constant as $m, n \to \infty$ (with the special case of $m$ fixed and $n \to \infty$
treated apart).

Beyond the optimal rate scaling laws, we showed the effectiveness of the general RAP-CIC approach 
with respect to: 1) conventional non-caching approaches such as unicasting or naive multicasting (the 
default solution in today’s wireless networks); 2) local caching policies, such as LFU, with naive 
multicasting; and 3) a specific embodiment of the general scheme proposed in [20], which consists
of splitting the library into subsets of files with approximately the same demand probability (in fact, 
differing at most by a factor of two) and then applying greedy coloring index coding separately to the 
different subsets. 

Our scaling results, while seemingly rather cumbersome, point out that the relation between the rate 
scaling and the various system parameters, even restricting to the case of Zipf demand distribution, can be 
very intricate and non-trivial. In particular, we characterized the regimes in which caching is useless (i.e., 
it provides no order gain with respect to conventional non-caching approaches), as well as the regimes in 
which caching exhibits multiplicative gains, i.e., the rate decreases (throughput increases) proportionally 
to a function of the per-user cache size M. Specifically, we identified the regions where the multiplicative 
caching gain is either linear or non-linear in M, and how it depends on the Zipf parameter a. Finally, for 
the regimes in which caching can provide multiplicative gains, we characterized 1) the regions in which 
the order-optimal RAP-CIC converges to conventional LFU with naive multicasting, showing when the 
additional coding complexity is not required, and 2) the regions in which (cooperative) fractional caching 
and index coding delivery is required for order-optimality.

Appendix A
Proof of Theorem 1


Let $J(\mathsf{C}, \mathsf{Q})$ denote the (random) number of independent sets found by Algorithm 1 applied to the
conflict graph $\mathcal{H}_{\mathsf{C},\mathsf{Q}}$ defined in Section III, where $\mathsf{C}$ is the random cache configuration resulting from the
random caching scheme with caching distribution $\mathbf{p}$, and $\mathsf{Q}$ is the packet-level demand vector resulting
from the random i.i.d. requests with demand distribution $\mathbf{q}$.

Recall that we consider the limit for $F, B \to \infty$ with fixed packet size $F/B$. Then, since the term $\bar{m}$
in the bound of Theorem 1 has already been shown to upper bound the average rate due to GCC$_2$ (see
Remark 3 in Section III-C), Theorem 1 follows by showing that

$$\lim_{B \to \infty} \mathbb{P}\left( \frac{\mathbb{E}[J(\mathsf{C},\mathsf{Q}) \,|\, \mathsf{C}]}{B} \le \psi(\mathbf{q},\mathbf{p}) + \epsilon \right) = 1, \qquad (29)$$

for any arbitrarily small $\epsilon > 0$.

By construction, the independent sets $\mathcal{I}$ generated by GCC$_1$ have the same (unordered) label of users
requesting or caching the packets $\{\rho(v) : v \in \mathcal{I}\}$. We shall refer to such unordered label of users as the
user label of the independent set. Hence, we count the independent sets by enumerating all possible user
labels, and upper bounding how many independent sets $\mathcal{I}$ Algorithm 1 generates for each user label.

Consider a user label $\mathcal{U}_\ell \subseteq \mathcal{U}$ of size $\ell$, and let $J_{\mathsf{C},\mathsf{Q}}(\mathcal{U}_\ell)$ denote the number of independent sets
generated by Algorithm 1 with label $\{\mu(v), \eta(v)\} = \mathcal{U}_\ell$. A necessary condition for the existence of an
independent set with user label $\mathcal{U}_\ell$ is that, for any user $u \in \mathcal{U}_\ell$, there exists a node $v$ such that: 1) $\mu(v) = u$
(user $u$ requests the packet corresponding to $v$), and 2) $\eta(v) = \mathcal{U}_\ell \setminus \{u\}$ (the packet corresponding to $v$
is cached in all users $\mathcal{U}_\ell \setminus \{u\}$ and not cached by any other user). Therefore, the following equality holds
with probability 1 (pointwise dominance)

$$J_{\mathsf{C},\mathsf{Q}}(\mathcal{U}_\ell) = \max_{u \in \mathcal{U}_\ell} \sum_{v : \rho(v) \ni f_u} 1\{\eta(v) = \mathcal{U}_\ell \setminus \{u\}\}. \qquad (30)$$
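In (30), with a slight abuse of notation, we denote the condition that the packet $\rho(v)$ associated to node
$v$ is requested by user $u$ as $\rho(v) \ni f_u$, indicating that the “file” field in the packet identifier $\rho(v)$ is equal
to the $u$-th component of the (random) request vector $\mathsf{f}$. The indicator function captures the necessary
condition for the existence of an independent set with user label $\mathcal{U}_\ell$, expressed (in words) above, and the
maximum over $u \in \mathcal{U}_\ell$ is necessary to obtain an upper bound. Notice that summing over $u \in \mathcal{U}_\ell$ instead
of taking the maximum would overcount the number of independent sets and yield a loose bound.

Then, using (30) and the definition of $J(\mathsf{C},\mathsf{Q})$, we can write

$$\begin{aligned}
\mathbb{E}[J(\mathsf{C},\mathsf{Q}) \,|\, \mathsf{C}] &= \mathbb{E}\left[ \sum_{\ell=1}^{n} \sum_{\mathcal{U}_\ell \subseteq \mathcal{U}} J_{\mathsf{C},\mathsf{Q}}(\mathcal{U}_\ell) \,\Big|\, \mathsf{C} \right] \\
&= \mathbb{E}\left[ \sum_{\ell=1}^{n} \sum_{\mathcal{U}_\ell \subseteq \mathcal{U}} \max_{u \in \mathcal{U}_\ell} \sum_{v : \rho(v) \ni f_u} 1\{\eta(v) = \mathcal{U}_\ell \setminus \{u\}\} \,\Big|\, \mathsf{C} \right] \\
&= \sum_{\mathbf{f} \in \mathcal{F}^n} \left( \prod_{j=1}^{n} q_{f_j} \right) \sum_{\ell=1}^{n} \sum_{\mathcal{U}_\ell \subseteq \mathcal{U}} \max_{u \in \mathcal{U}_\ell} \sum_{v : \rho(v) \ni f_u} 1\{\eta(v) = \mathcal{U}_\ell \setminus \{u\}\} && (31) \\
&= \sum_{\ell=1}^{n} \binom{n}{\ell} \sum_{\mathbf{d} \in \mathcal{F}^\ell} \left( \prod_{j=1}^{\ell} q_{d_j} \right) \max_{u \in \{1,\ldots,\ell\}} \sum_{v : \rho(v) \ni d_u} 1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\} && (32) \\
&= \sum_{\ell=1}^{n} \binom{n}{\ell} \sum_{\mathbf{d} \in \mathcal{F}^\ell} \left( \prod_{j=1}^{\ell} q_{d_j} \right) \sum_{f \in \mathcal{F}} 1\left\{ f = \arg\max_{j \in \mathbf{d}} \sum_{v : \rho(v) \ni j} 1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\} \right\} && (33) \\
&\qquad \cdot \sum_{v : \rho(v) \ni f} 1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\}, && (34)
\end{aligned}$$

where (31) follows by writing the conditional expectation with respect to the demand vector explicitly
in terms of a sum over all possible files, after recognizing that the indicator function
$1\{\eta(v) = \mathcal{U}_\ell \setminus \{u\}\}$ is a random variable that is a function only of the cache placement $\mathsf{C}$ (in fact, it
depends only on whether the $\ell - 1$ users in $\mathcal{U}_\ell \setminus \{u\}$ have cached or not the packet associated to node $v$),
and where (32) follows by noticing that the term

$$\max_{u \in \mathcal{U}_\ell} \sum_{v : \rho(v) \ni f_u} 1\{\eta(v) = \mathcal{U}_\ell \setminus \{u\}\}$$

depends only on the $\ell$ (possibly repeated) indices $\{f_u : u \in \mathcal{U}_\ell\}$. Therefore, after switching the summation
order and marginalizing with respect to all the file indices corresponding to the requests of the users not
in $\mathcal{U}_\ell$, due to the symmetry of the random caching placement and the demand distribution (i.i.d. across
the users) we can focus on a generic user label of size $\ell$, which without loss of generality can be set to
be $\{1, \ldots, \ell\}$. At this point, the sum with respect to $\mathcal{U}_\ell \subseteq \mathcal{U}$ reduces to enumerating all the subsets of
size $\ell$ in the user set of size $n$, yielding the binomial coefficient $\binom{n}{\ell}$. Finally, (34) follows from replacing
the max with a sum over all possible file indices, and multiplying by the indicator function that picks
the maximum.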














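At this point, we need to study the behavior of the random variable

$$Y_{\ell,f} = \sum_{v : \rho(v) \ni f} 1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\}, \qquad (35)$$

where $u \in \mathcal{U}_\ell$ and where, by construction, the sum extends to the nodes corresponding to file $f$
requested by user $u$, i.e., not present in its cache. By construction of the caching scheme, these nodes are
$B(1 - p_f M)$ in number. Furthermore, the random variable $1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\}$ takes value 1 with
probability $(p_f M)^{\ell-1}(1 - p_f M)^{n-\ell}$, corresponding to the fact that $\ell - 1$ users have cached packet $\rho(v)$
and $n - \ell$ users have not cached it (user $u$ has not cached it by construction, i.e., we are conditioning on
this event). However, these indicators are not i.i.d. across different $v$. By denoting
$P_{\ell,f} = (p_f M)^{\ell-1}(1 - p_f M)^{n-\ell}$, we can see that

$$\mathbb{E}[Y_{\ell,f}] = \mathbb{E}\left[ \sum_{v : \rho(v) \ni f} 1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\} \right] = B(1 - p_f M) P_{\ell,f}. \qquad (36)$$

Then, for $\rho(v), \rho(v') \ni f$,

$$\begin{aligned}
&\mathbb{P}\left( 1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\} = 1,\ 1\{\eta(v') = \{1,\ldots,\ell\} \setminus \{u\}\} = 1 \right) \\
&\quad = P_{\ell,f}\, \mathbb{P}\left( 1\{\eta(v') = \{1,\ldots,\ell\} \setminus \{u\}\} = 1 \,\big|\, 1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\} = 1 \right) \\
&\quad = P_{\ell,f}\, (p'_f M)^{\ell-1}(1 - p'_f M)^{n-\ell}, \qquad (37)
\end{aligned}$$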


where 


[pfMB-2) PfM-l 

( B-1 N =Pf- B-1 


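and $\delta(B) \to 0$ as $B \to \infty$, independently of $p_f$. Let $P'_{\ell,f} = (p'_f M)^{\ell-1}(1 - p'_f M)^{n-\ell}$; then we obtain

$$\begin{aligned}
P'_{\ell,f} &= (p'_f M)^{\ell-1}(1 - p'_f M)^{n-\ell} \\
&= \left((p_f + \delta(B))M\right)^{\ell-1}\left(1 - (p_f + \delta(B))M\right)^{n-\ell} \\
&= P_{\ell,f} + \frac{\partial P_{\ell,f}}{\partial p_f}\bigg|_{p_f = p'_f} \cdot \delta(B) + o(\delta(B)) \\
&= P_{\ell,f} + \left( (\ell-1)(p_f M)^{\ell-2}(1 - p_f M)^{n-\ell} M - (n-\ell)(p_f M)^{\ell-1}(1 - p_f M)^{n-\ell-1} M \right)\Big|_{p_f = p'_f} \cdot \delta(B) + o(\delta(B)) \\
&= P_{\ell,f} + \delta'(B), \qquad (38)
\end{aligned}$$

where $\delta'(B) \to 0$ as $B \to \infty$, independently of $P_{\ell,f}$. Then, we have

$$\begin{aligned}
\mathbb{E}[Y_{\ell,f}^2] &= \mathbb{E}\left[ \left( \sum_{v : \rho(v) \ni f} 1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\} \right)^2 \right] \\
&= \sum_{v : \rho(v) \ni f} P_{\ell,f} + \sum_{v : \rho(v) \ni f}\ \sum_{v' : \rho(v') \ni f,\, v' \neq v} \left( P_{\ell,f} P_{\ell,f} + P_{\ell,f}\, \delta'(B) \right) \\
&= \left( \sum_{v : \rho(v) \ni f} P_{\ell,f} + \sum_{v : \rho(v) \ni f}\ \sum_{v' : \rho(v') \ni f,\, v' \neq v} P_{\ell,f} P_{\ell,f} \right) + \sum_{v : \rho(v) \ni f}\ \sum_{v' : \rho(v') \ni f,\, v' \neq v} P_{\ell,f}\, \delta'(B) \\
&= B(1 - p_f M) P_{\ell,f}(1 - P_{\ell,f}) + \left(B(1 - p_f M)\right)^2 P_{\ell,f}^2 + o(B^2)\,(1 - p_f M)^2 P_{\ell,f}. \qquad (39)
\end{aligned}$$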

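Therefore, by using the fact that $\mathrm{Var}(Y_{\ell,f}) = \mathbb{E}[Y_{\ell,f}^2] - \mathbb{E}[Y_{\ell,f}]^2$ and Chebyshev's inequality, we
have that⁷

$$\frac{Y_{\ell,f}}{B(1 - p_f M)} \xrightarrow{\,p\,} (p_f M)^{\ell-1}(1 - p_f M)^{n-\ell}, \qquad \text{for } B \to \infty.\text{⁸}$$

Equivalently, we can write

$$\lim_{B \to \infty} \mathbb{P}\left( \left| \frac{Y_{\ell,f}}{B} - g_\ell(f) \right| \le \epsilon \right) = 1, \qquad (40)$$

for any arbitrarily small $\epsilon > 0$, where we define the function (already introduced in Remark 5 in Section
III-C),

$$g_\ell(f) = (p_f M)^{\ell-1}(1 - p_f M)^{n-\ell+1}. \qquad (41)$$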
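The concentration of $Y_{\ell,f}/B$ around $g_\ell(f)$ can be checked with a small Monte Carlo experiment. The sketch below makes simplifying assumptions that are labeled in the comments: each user independently caches a uniformly random subset of $\mathrm{round}(p_f M B)$ packets of file $f$, the user label is $\{0, \ldots, \ell-1\}$ with requesting user $u = 0$, and the function names are hypothetical.

```python
import random


def simulate_Y_over_B(n, ell, pM, B, seed=0):
    """Estimate Y_{ell,f}/B for one file whose per-user cached fraction is pM = p_f * M."""
    rng = random.Random(seed)
    k = round(pM * B)  # assumed: each user caches exactly k of the B packets
    caches = [set(rng.sample(range(B), k)) for _ in range(n)]
    u = 0                     # requesting user
    inside = range(1, ell)    # the other ell - 1 users of the label
    outside = range(ell, n)   # users outside the label
    count = 0
    for b in range(B):
        if b in caches[u]:
            continue          # packet already cached by u, hence not requested
        if all(b in caches[i] for i in inside) and not any(b in caches[j] for j in outside):
            count += 1
    return count / B


def g(ell, pM, n):
    """g_ell(f) of (41), written in terms of pM = p_f * M."""
    return pM ** (ell - 1) * (1 - pM) ** (n - ell + 1)


print(simulate_Y_over_B(n=4, ell=2, pM=0.5, B=20000), g(2, 0.5, 4))
```

For large $B$ the two printed numbers should be close, in line with (40).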


It follows that we can replace the last line of (34) by the bound (holding with high probability)
$B(g_\ell(f) + \epsilon)$. In order to handle the indicator function in (33), we need to consider the concentration (40)
of $Y_{\ell,f}/B$ around the values $g_\ell(f)$. Sorting the values $\{g_\ell(f) : f \in \mathcal{F}\}$ in increasing order, we obtain a grid
of at most $m$ discrete values. The limit in probability (40) states that the random variables $Y_{\ell,j}/B$
concentrate around their corresponding values $g_\ell(j)$, for any $j \in \mathcal{F}$. Taking $\epsilon$ sufficiently small, the
intervals $[g_\ell(j) - \epsilon, g_\ell(j) + \epsilon]$ for different $j$ are mutually disjoint, unless there
are some $j \neq j'$ such that $g_\ell(j) = g_\ell(j')$. For the moment we assume that all these values are distinct, and
we handle the case of non-distinct values at the end (we will see that this does not cause any problem).

⁷ Here we need only convergence in probability.
⁸ As usual, $\xrightarrow{\,p\,}$ indicates limit in probability [58].


Now, we re-write the indicator function in (33) as

$$1\left\{ f = \arg\max_{j \in \mathbf{d}} \sum_{v : \rho(v) \ni j} 1\{\eta(v) = \{1,\ldots,\ell\} \setminus \{u\}\} \right\} = 1\left\{ f = \arg\max_{j \in \mathbf{d}} Y_{\ell,j}/B \right\} \qquad (42)$$

and compare it with the indicator function

$$1\left\{ f = \arg\max_{j \in \mathbf{d}} g_\ell(j) \right\}. \qquad (43)$$

If $f \notin \mathbf{d}$ then both indicator functions are equal to 0. If $f \in \mathbf{d}$, suppose that $f = \arg\max_{j \in \mathbf{d}} g_\ell(j)$,
such that (43) is equal to 1. Then, (42) is equal to 0 only if for some $j \in \mathbf{d}$, $j \neq f$, $Y_{\ell,j}/B > Y_{\ell,f}/B$.
Since $Y_{\ell,j}/B \in [g_\ell(j) - \epsilon, g_\ell(j) + \epsilon]$ and $Y_{\ell,f}/B \in [g_\ell(f) - \epsilon, g_\ell(f) + \epsilon]$ with high probability,
and, by construction, $g_\ell(j) + \epsilon < g_\ell(f) - \epsilon$, it follows that this event has vanishing probability as
$B \to \infty$. Similarly, suppose that $f \in \mathbf{d}$ and that $f \neq \arg\max_{j \in \mathbf{d}} g_\ell(j)$, such that (43) is equal to 0.
Then, (42) is equal to 1 only if $Y_{\ell,f}/B > Y_{\ell,j_{\max}}/B$, where $j_{\max} = \arg\max_{j \in \mathbf{d}} g_\ell(j)$. Again, since
$Y_{\ell,f}/B \in [g_\ell(f) - \epsilon, g_\ell(f) + \epsilon]$ and $Y_{\ell,j_{\max}}/B \in [g_\ell(j_{\max}) - \epsilon, g_\ell(j_{\max}) + \epsilon]$ with high probability, and,
by construction, $g_\ell(f) + \epsilon < g_\ell(j_{\max}) - \epsilon$, it follows that this event has vanishing probability as $B \to \infty$.
We conclude that

$$\lim_{B \to \infty} \mathbb{P}\left( \left| 1\left\{ f = \arg\max_{j \in \mathbf{d}} Y_{\ell,j}/B \right\} - 1\left\{ f = \arg\max_{j \in \mathbf{d}} g_\ell(j) \right\} \right| \le \epsilon \right) = 1. \qquad (44)$$

Since convergence in probability implies convergence in the $r$-th mean for uniformly absolutely bounded
random variables [58] and indicator functions are obviously bounded by 1, we conclude that

$$\mathbb{E}\left[ 1\left\{ f = \arg\max_{j \in \mathcal{D}} Y_{\ell,j}/B \right\} \right] \to \mathbb{E}\left[ 1\left\{ f = \arg\max_{j \in \mathcal{D}} g_\ell(j) \right\} \right]$$

as $B \to \infty$, where $\mathcal{D}$ is a random subset of $\ell$ elements sampled i.i.d. (with replacement) from $\mathcal{F}$ with
probability mass function $\mathbf{q}$. Now, replacing the last line of (34) with the deterministic bound $B(g_\ell(f) + \epsilon)$
(which holds with high probability as explained before) and taking expectation of the indicator function
using the convergence of the mean stated above, we can continue the chain of inequalities after (34) and
show that the bound

$$\mathbb{E}[J(\mathsf{C},\mathsf{Q}) \,|\, \mathsf{C}] \le B \sum_{\ell=1}^{n} \binom{n}{\ell} \sum_{f \in \mathcal{F}} \mathbb{P}\left( f = \arg\max_{j \in \mathcal{D}} g_\ell(j) \right) \left( g_\ell(f) + \epsilon \right) \qquad (45)$$

holds with high probability for $B \to \infty$, for any arbitrary $\epsilon > 0$. In the case where for some distinct $j, j'$
the corresponding values of $g_\ell(j)$ and $g_\ell(j')$ coincide, we notice that the outcomes of the indicator functions
(42) and (43) are irrelevant to the value of the bound, as long as they pick different indices which yield
the same maximum value of the function $g_\ell(\cdot)$. Hence, the argument can be extended to this case by
defining “equivalent classes” of indices which yield the same value in the bound.

Theorem 1 now follows by using (41) and by noticing that the probabilities $\mathbb{P}(f = \arg\max_{j \in \mathcal{D}} g_\ell(j))$
coincide with the terms $\rho_{f,\ell}$ defined in Theorem 1.


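Appendix B
Proof of Lemma 1

Applying Theorem 1 to the case $\mathbf{p} = \tilde{\mathbf{p}}$, we have that, for all $\epsilon > 0$,

$$\lim_{F \to \infty} \mathbb{P}\left( R^{\rm GCC}(n, m, M, \mathbf{q}, \tilde{\mathbf{p}}) \le \min\{\psi(\mathbf{q}, \tilde{\mathbf{p}}), \bar{m}\} + \epsilon \right) = 1.$$

Then, we can write

$$\begin{aligned}
\psi(\mathbf{q}, \tilde{\mathbf{p}}) &= \sum_{\ell=1}^{n} \binom{n}{\ell} \sum_{f=1}^{m} \rho_{f,\ell}\, (\tilde{p}_f M)^{\ell-1} (1 - \tilde{p}_f M)^{n-\ell+1} \\
&= \mathbb{E}\left[ \sum_{\ell=1}^{\mathsf{N}_{\tilde{m}}} \binom{\mathsf{N}_{\tilde{m}}}{\ell} \left( \frac{M}{\tilde{m}} \right)^{\ell-1} \left( 1 - \frac{M}{\tilde{m}} \right)^{\mathsf{N}_{\tilde{m}}-\ell+1} \right] + n \sum_{f=\tilde{m}+1}^{m} q_f \\
&= \mathbb{E}\left[ \left( \frac{\tilde{m}}{M} - 1 \right) \left( 1 - \left( 1 - \frac{M}{\tilde{m}} \right)^{\mathsf{N}_{\tilde{m}}} \right) \right] + n \sum_{f=\tilde{m}+1}^{m} q_f \\
&\overset{(a)}{\le} \left( \frac{\tilde{m}}{M} - 1 \right) \left( 1 - \left( 1 - \frac{M}{\tilde{m}} \right)^{\mathbb{E}[\mathsf{N}_{\tilde{m}}]} \right) + n \sum_{f=\tilde{m}+1}^{m} q_f \\
&\overset{(b)}{=} \left( \frac{\tilde{m}}{M} - 1 \right) \left( 1 - \left( 1 - \frac{M}{\tilde{m}} \right)^{n G_{\tilde{m}}} \right) + n \left( 1 - G_{\tilde{m}} \right) \qquad (46) \\
&= \psi(\mathbf{q}, \tilde{m}), \qquad (47)
\end{aligned}$$

where $\mathsf{N}_{\tilde{m}}$ is the (random) number of users requesting files with index less than or equal to $\tilde{m}$, (a) follows
from Jensen's inequality, and (b) because $\mathbb{E}[\mathsf{N}_{\tilde{m}}] = n \sum_{f=1}^{\tilde{m}} q_f = n G_{\tilde{m}}$.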
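Two steps of this derivation lend themselves to a quick numerical check, under the assumption that the chain has the form reconstructed here: the binomial sum $\sum_{\ell=1}^{N} \binom{N}{\ell} x^{\ell-1}(1-x)^{N-\ell+1}$ collapses to $(1/x - 1)(1 - (1-x)^N)$ with $x = M/\tilde{m}$, and the Jensen step (a) replaces $\mathsf{N}_{\tilde{m}} \sim \mathrm{Binomial}(n, G_{\tilde{m}})$ by its mean. Function names below are illustrative only.

```python
from math import comb


def inner_sum(N, x):
    """Sum over ell of C(N, ell) x^(ell-1) (1-x)^(N-ell+1)."""
    return sum(comb(N, l) * x ** (l - 1) * (1 - x) ** (N - l + 1) for l in range(1, N + 1))


def closed_form(N, x):
    """(1/x - 1)(1 - (1-x)^N); N may be non-integer (used for the Jensen step)."""
    return (1 / x - 1) * (1 - (1 - x) ** N)


def expected_closed_form(n, G, x):
    """E[closed_form(N, x)] for N ~ Binomial(n, G), computed exactly."""
    return sum(comb(n, k) * G ** k * (1 - G) ** (n - k) * closed_form(k, x)
               for k in range(n + 1))


print(inner_sum(7, 0.3), closed_form(7, 0.3))
print(expected_closed_form(20, 0.6, 0.25), closed_form(20 * 0.6, 0.25))
```

The first pair of values coincides, and the second pair respects the inequality in (a), because $N \mapsto 1 - (1-x)^N$ is concave.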


Appendix C
Proof of Theorem 2

First, notice that since the users' decoders $\lambda_u(\cdot)$ operate independently, the rate of the optimal scheme
$R^*(n, m, M, \mathbf{q})$ is non-decreasing in $n$. In fact, an admissible scheme for $n$ users is also admissible for
any $n' < n$ users.⁹

⁹ To see this, simply add $n - n'$ virtual users to the reduced system with $n'$ users, generate the corresponding random i.i.d.
demands according to $\mathbf{q}$, and use the code for the system of $n$ users, to achieve the same rate, which is clearly larger than or equal
to the optimal rate for the system with $n'$ users.

The first step of the proof consists of lower bounding the rate of any admissible scheme with the optimal
rate of a genie-aided system that eliminates some users. By construction of the genie, we can lower bound
the optimal rate of the genie-aided system by the optimal rate over an ensemble of reduced systems with
binomially distributed number of users, reduced library size, and uniform demand distribution. Finally, we
lower bound such ensemble average rate with a lower bound on the optimal rate in the case of arbitrary
(non-random) demands, by using a result proven in [20], which we state here for convenience, expressed
in our notation, as Lemma 3.

Fix $\ell \in \{1, \ldots, m\}$ and consider the following genie-aided system: given the request vector $\mathsf{f}$, all users
$u \in \mathcal{U}$ such that $f_u > \ell$ are served by a genie at no transmission cost. For each $u \in \mathcal{U}$ such that $f_u \le \ell$,
the genie flips an independent biased coin and serves user $u$ at no transmission cost with probability
$1 - q_\ell / q_{f_u}$, while the system has to serve user $u$ by transmission on the shared link with probability
$q_\ell / q_{f_u}$. We let $\mathsf{N}$ denote the number of users that require service from the system (i.e., not handled by
the genie).

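It is immediate to see that $\mathsf{N} \sim \mathrm{Binomial}(n, \ell q_\ell)$. In fact, any user $u$ requires service
from the system with probability

$$\begin{aligned}
\mathbb{P}(u \text{ requires service}) &= \sum_{f=1}^{m} \mathbb{P}(u \text{ requires service} \,|\, f_u = f)\, q_f \\
&= \sum_{f=1}^{\ell} \mathbb{P}(u \text{ requires service} \,|\, f_u = f)\, q_f \qquad (48) \\
&= \sum_{f=1}^{\ell} \frac{q_\ell}{q_f}\, q_f = \ell q_\ell, \qquad (49)
\end{aligned}$$

where (48) follows from the fact that, by construction, $\mathbb{P}(u \text{ requires service} \,|\, f_u = f) = 0$ for $f > \ell$.
Notice also that $\ell q_\ell \le 1$, since we have assumed a monotonically non-increasing demand distribution $\mathbf{q}$,
and if $\ell q_\ell > 1$ then $q_f > 1/\ell$ for all $1 \le f \le \ell$, such that $\sum_{f=1}^{\ell} q_f > 1$, which is impossible by the
definition of probability mass function.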

Now, notice that the optimal achievable rate for the genie-aided scheme provides a lower bound to
the optimal achievable rate $R^*(n, m, M, \mathbf{q})$ of the original system. In fact, as argued before, the genie
eliminates a random subset of users (which depends on the realization of the request vector and on the
outcome of the independent coins flipped by the genie). We let $R^*_{\rm genie}(n, m, M, \mathbf{q})$ denote the optimal
rate of the genie-aided scheme. Furthermore, we notice that in the genie-aided system the only requests
that are handled by the system are made with uniform independent probability over the reduced library
$\{1, \ldots, \ell\}$. In fact, we have

$$\begin{aligned}
\mathbb{P}(f_u = f \,|\, u \text{ requires service}) &= \frac{\mathbb{P}(f_u = f,\ u \text{ requires service})}{\mathbb{P}(u \text{ requires service})} \\
&= \frac{\mathbb{P}(u \text{ requires service} \,|\, f_u = f)\, q_f}{\ell q_\ell} \\
&= \begin{cases} \dfrac{1}{\ell} & \text{for } f \in \{1, \ldots, \ell\}, \\ 0 & \text{for } f > \ell. \end{cases} \qquad (50)
\end{aligned}$$
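The two properties of the genie construction, namely that a user requires service with probability $\ell q_\ell$ and that the served requests are uniform over $\{1, \ldots, \ell\}$, can be verified exactly with rational arithmetic. The sketch below assumes the coin bias $q_\ell / q_f$ for a user requesting file $f \le \ell$; the function name is hypothetical.

```python
from fractions import Fraction


def genie_service_stats(q, ell):
    """q: non-increasing pmf as Fractions; ell: 1-based cut-off index."""
    q_ell = q[ell - 1]
    # Joint probability that a user requests file f (0-based) and requires service.
    joint = [q[f] * (q_ell / q[f]) if f < ell else Fraction(0) for f in range(len(q))]
    p_service = sum(joint)
    conditional = [j / p_service for j in joint]
    return p_service, conditional


q = [Fraction(7, 10), Fraction(21, 100), Fraction(9, 100)]
p_service, cond = genie_service_stats(q, ell=2)
print(p_service, cond)  # p_service equals ell * q_ell; cond is uniform on the first ell files
```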

It follows that, for a given set of users requiring service, the optimal rate of a system restricted to those
users, with library size equal to $\ell$, and uniform demand distribution, is not larger than the optimal rate of
the genie-aided original system. Moreover, by the symmetry of the system with respect to the users, this
optimal rate does not depend on the specific set of users requesting service, but only on its size, which
is given by $\mathsf{N}$, as defined before. Consistently with the notation introduced earlier, for any $\mathsf{N} = N$
this optimal rate is denoted by $R^*(N, \ell, M, (1/\ell, \ldots, 1/\ell))$. Then, we can write:

$$\begin{aligned}
R^*_{\rm genie}(n, m, M, \mathbf{q}) &= \sum_{N=1}^{n} R^*(N, \ell, M, (1/\ell, \ldots, 1/\ell))\, \mathbb{P}(\mathsf{N} = N) \\
&\ge \sum_{N=r}^{n} R^*(N, \ell, M, (1/\ell, \ldots, 1/\ell))\, \mathbb{P}(\mathsf{N} = N) \qquad (51) \\
&\ge R^*(r, \ell, M, (1/\ell, \ldots, 1/\ell))\, \mathbb{P}(\mathsf{N} \ge r), \qquad (52)
\end{aligned}$$

where (51) holds for any $1 \le r \le n$, since the summation contains non-negative terms, and where (52)
follows again by the fact that the optimal rate is non-decreasing in the number of users.

A lower bound on $R^*(r, \ell, M, (1/\ell, \ldots, 1/\ell))$ can be given in terms of the lower bound (converse)
result on the optimum rate for a shared link network with arbitrary demands (see Lemma 3 in [20]).
This is given by the following:

Lemma 3: Any admissible scheme achieving rate $R(r, \ell, M, \{1/\ell, \ldots, 1/\ell\})$ for the shared link network
with $r$ users, library size $\ell$, cache capacity $M$, and uniform demand distribution $\{1/\ell, \ldots, 1/\ell\}$ must
satisfy

$$R(r, \ell, M, \{1/\ell, \ldots, 1/\ell\}) \ge z \left( 1 - \frac{M}{\lfloor \ell / z \rfloor} \right) \mathbb{P}(\mathsf{Z} \ge z), \qquad (53)$$

for any $z \in \{1, \ldots, \ell\}$, where $\mathsf{Z}$ is a random variable indicating the number of distinct files requested
when the random demand vector is i.i.d. $\sim \mathrm{Uniform}\{1, \ldots, \ell\}$. □

Using Lemma 3 in (52) we have

$$R^*(n, m, M, \mathbf{q}) \ge z\, \mathbb{P}(\mathsf{N} \ge r)\, \mathbb{P}(\mathsf{Z} \ge z) \left( 1 - \frac{M}{\lfloor \ell / z \rfloor} \right). \qquad (54)$$

Next, we further lower bound the two probabilities $\mathbb{P}(\mathsf{N} \ge r)$ and $\mathbb{P}(\mathsf{Z} \ge z)$ and find the range of the
corresponding parameters. To this purpose, we recall the definition of self-bounding function:

Definition 3: Let $\mathcal{X} \subseteq \mathbb{R}$ and consider a nonnegative $\nu$-variate function $g : \mathcal{X}^\nu \to [0, \infty)$. We say that $g$
has the self-bounding property if there exist functions $g_i : \mathcal{X}^{\nu-1} \to \mathbb{R}$ such that, for all $(x_1, \ldots, x_\nu) \in \mathcal{X}^\nu$
and all $i = 1, \ldots, \nu$,

$$0 \le g(x_1, \ldots, x_\nu) - g_i(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_\nu) \le 1, \qquad (55)$$

and

$$\sum_{i=1}^{\nu} \left( g(x_1, \ldots, x_\nu) - g_i(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_\nu) \right) \le g(x_1, \ldots, x_\nu). \qquad (56)$$

◊

The following lemma [59] yields a concentration property of random variables expressed as
self-bounding functions of random vectors.

Lemma 4: Consider $\mathcal{X} \subseteq \mathbb{R}$ and the random vector $\mathsf{X} = (\mathsf{X}_1, \ldots, \mathsf{X}_\nu) \in \mathcal{X}^\nu$. Let $\mathsf{Y} = g(\mathsf{X})$ where $g(\cdot)$
has the self-bounding property of Definition 3. Then, for any $0 < \mu \le \mathbb{E}[\mathsf{Y}]$,

$$\mathbb{P}(\mathsf{Y} - \mathbb{E}[\mathsf{Y}] \le -\mu) \le \exp\left( - \frac{\mu^2}{2\, \mathbb{E}[\mathsf{Y}]} \right). \qquad (57)$$

□


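Next, we observe that $g(x_1, \ldots, x_\nu) = \sum_{i=1}^{\nu} x_i$ is self-bounding when its argument is a binary vector
(i.e., $\mathcal{X} = \{0, 1\}$). Hence, $\mathsf{N}$ satisfies Lemma 4 and we can write

$$\mathbb{P}(\mathsf{N} \ge \mathbb{E}[\mathsf{N}] - \mu) \ge 1 - \exp\left( - \frac{\mu^2}{2\, \mathbb{E}[\mathsf{N}]} \right), \qquad (58)$$

with $0 < \mu \le \mathbb{E}[\mathsf{N}]$. Since $\mathsf{N} \sim \mathrm{Binomial}(n, \ell q_\ell)$ we have $\mathbb{E}[\mathsf{N}] = n \ell q_\ell$. Hence, letting $\mu = \mathbb{E}[\mathsf{N}] - r$ in
(58) we obtain

$$\mathbb{P}(\mathsf{N} \ge r) \ge 1 - \exp\left( - \frac{(n \ell q_\ell - r)^2}{2 n \ell q_\ell} \right) = P_1(\ell, r), \qquad (59)$$

for $0 < r \le n \ell q_\ell$.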
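As a sanity check, the bound $P_1(\ell, r)$ in (59) can be compared with the exact binomial tail. The sketch below is illustrative: the parameter `ell_q_ell` stands for the product $\ell q_\ell$, and the numerical values are arbitrary.

```python
from math import comb, exp


def binomial_tail(n, p, r):
    """Exact P(N >= r) for N ~ Binomial(n, p) and integer r."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(r, n + 1))


def P1(n, ell_q_ell, r):
    """Lower bound (59) on P(N >= r), valid for 0 < r <= n * ell_q_ell."""
    mean = n * ell_q_ell
    return 1.0 - exp(-((mean - r) ** 2) / (2.0 * mean))


n, p = 50, 0.42  # the mean is 21
for r in (5, 10, 15, 20):
    print(r, round(binomial_tail(n, p, r), 4), round(P1(n, p, r), 4))
```

In each row the exact tail dominates the bound; the bound becomes loose as $r$ approaches the mean, where it tends to 0.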

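The variable $\mathsf{Z}$ defined in Lemma 3 can be written as

$$\mathsf{Z} = \sum_{f=1}^{\ell} 1\{\exists\, u \text{ requesting file } f\}. \qquad (60)$$

Although the binary random variables $1\{\exists\, u \text{ requesting file } f\}$ are not mutually independent,
nevertheless $\mathsf{Z}$ is given as the sum of the components of a binary vector and therefore Lemma 4 applies. In
particular, we have

$$\mathbb{E}[\mathsf{Z}] = \ell \left( 1 - \left( 1 - \frac{1}{\ell} \right)^{r} \right),$$

such that, operating as before, we arrive at the lower bound

$$\mathbb{P}(\mathsf{Z} \ge \tilde{z}) \ge 1 - \exp\left( - \frac{(\mathbb{E}[\mathsf{Z}] - \tilde{z})^2}{2\, \mathbb{E}[\mathsf{Z}]} \right) = P_2(\ell, r, \tilde{z}), \qquad (61)$$

for $0 < \tilde{z} \le \mathbb{E}[\mathsf{Z}]$.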
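The expression $\mathbb{E}[\mathsf{Z}] = \ell(1 - (1 - 1/\ell)^r)$ for the expected number of distinct files requested by $r$ users drawing uniformly from $\ell$ files can be confirmed by exhaustive enumeration on a small instance. This is only a check of that standard identity; the function names are illustrative.

```python
from itertools import product


def expected_distinct_exact(ell, r):
    """Average number of distinct files over all ell**r equally likely demand vectors."""
    total = 0
    for demand in product(range(ell), repeat=r):
        total += len(set(demand))
    return total / ell ** r


def expected_distinct_formula(ell, r):
    return ell * (1 - (1 - 1 / ell) ** r)


print(expected_distinct_exact(4, 5), expected_distinct_formula(4, 5))
```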


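Then, for some $\ell$, $r$, by maximizing the obtained lower bound (54) with respect to the free parameter
$z$, we obtain

$$R^*(n, m, M, \mathbf{q}) \ge \mathbb{P}(\mathsf{N} \ge r) \max_{z \in \{1, \ldots, \lceil \min\{\mathbb{E}[\mathsf{Z}], r\} \rceil\}} \mathbb{P}(\mathsf{Z} \ge z) \left( z - zM / \lfloor \ell / z \rfloor \right). \qquad (62)$$

Let $\tilde{z} \in (0, \mathbb{E}[\mathsf{Z}]]$; we consider two cases: $\tilde{z} \ge 1$ and $0 < \tilde{z} < 1$.
1) If $\tilde{z} \ge 1$, let $r \ge 1$; then

$$\begin{aligned}
R^*(n, m, M, \mathbf{q}) &\ge \mathbb{P}(\mathsf{N} \ge r) \max_{z \in \{1, \ldots, \lceil \min\{\mathbb{E}[\mathsf{Z}], r\} \rceil\}} \mathbb{P}(\mathsf{Z} \ge z) \left( z - zM / \lfloor \ell / z \rfloor \right) \\
&\overset{(a)}{\ge} \mathbb{P}(\mathsf{N} \ge r) \max_{z \in \{1, \ldots, \lceil \min\{\tilde{z}, r\} \rceil\}} \mathbb{P}(\mathsf{Z} \ge z) \left( z - zM / \lfloor \ell / z \rfloor \right) \\
&\overset{(b)}{\ge} \mathbb{P}(\mathsf{N} \ge r)\, \mathbb{P}(\mathsf{Z} \ge \tilde{z}) \max_{z \in \{1, \ldots, \lceil \min\{\tilde{z}, r\} \rceil\}} \left( z - zM / \lfloor \ell / z \rfloor \right), \qquad (63)
\end{aligned}$$

where (a) is because $\tilde{z} \le \mathbb{E}[\mathsf{Z}]$ and (b) is because if $z \le \tilde{z}$ then $\mathbb{P}(\mathsf{Z} \ge z) \ge \mathbb{P}(\mathsf{Z} \ge \tilde{z})$. By
using the lower bounds (59) and (61) in (54), we have

$$R^*(n, m, M, \mathbf{q}) \ge P_1(\ell, r)\, P_2(\ell, r, \tilde{z}) \max_{z \in \{1, \ldots, \lceil \min\{\tilde{z}, r\} \rceil\}} \left( z - zM / \lfloor \ell / z \rfloor \right).$$

2) If $0 < \tilde{z} < 1$, then let $r = 1$, $z = 1$ and, using (59) and (61) in (54), we have

$$\begin{aligned}
R^*(n, m, M, \mathbf{q}) &\ge \mathbb{P}(\mathsf{N} \ge 1)\, \mathbb{P}(\mathsf{Z} \ge 1)\, (1 - M/\ell) \\
&\overset{(a)}{=} \mathbb{P}(\mathsf{N} \ge 1)\, \mathbb{P}(\mathsf{Z} \ge \tilde{z})\, (1 - M/\ell) \\
&\ge \mathbb{P}(\mathsf{N} \ge 1)\, P_2(\ell, 1, \tilde{z})\, (1 - M/\ell) \qquad (64) \\
&\overset{(b)}{\ge} P_1(\ell, r)\, P_2(\ell, 1, \tilde{z})\, (1 - M/\ell), \qquad (65)
\end{aligned}$$

where (a) follows from observing that $\mathsf{Z}$ is an integer, so that $\mathbb{P}(\mathsf{Z} \ge 1) = \mathbb{P}(\mathsf{Z} \ge \tilde{z})$ when
$\tilde{z} \in (0, 1)$. Similarly, (b) holds also for $r \ge 1$, since in this case $\mathbb{P}(\mathsf{N} \ge 1) \ge \mathbb{P}(\mathsf{N} \ge r)$.

Therefore, taking the maximum of (64) and (65), we obtain

$$R^*(n, m, M, \mathbf{q}) \ge \max\left\{ P_1(\ell, r)\, P_2(\ell, r, \tilde{z}) \max_{z \in \{1, \ldots, \lceil \min\{\tilde{z}, r\} \rceil\}} \left( z - zM / \lfloor \ell / z \rfloor \right) 1\{\tilde{z}, r \ge 1\},\ 
P_1(\ell, r)\, P_2(\ell, 1, \tilde{z})\, (1 - M/\ell)\, 1\{\tilde{z} \in (0, 1)\} \right\}.$$

Maximizing over $\ell$, $r$, and $\tilde{z}$, we obtain (19) in Theorem 2.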
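For small systems the resulting converse can be evaluated by brute force. The sketch below is a restricted grid search under explicit assumptions: it uses the expressions for $P_1$ and $P_2$ from (59) and (61), searches only over integer $r \ge 1$ and integer $\tilde{z} \ge 1$ (so it ignores the $\tilde{z} \in (0,1)$ branch and may return a value smaller than the full maximum), and all function names are hypothetical.

```python
from math import exp, floor


def zipf_pmf(m, alpha):
    """Zipf demand distribution over files 1..m (non-increasing)."""
    weights = [f ** (-alpha) for f in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def lower_bound(q, n, M):
    """Grid-search sketch of the converse; q must be non-increasing."""
    best = 0.0
    for ell in range(1, len(q) + 1):
        mean_N = n * ell * q[ell - 1]
        for r in range(1, int(mean_N) + 1):
            p1 = 1.0 - exp(-((mean_N - r) ** 2) / (2.0 * mean_N))
            mean_Z = ell * (1.0 - (1.0 - 1.0 / ell) ** r)
            for z_t in range(1, int(mean_Z) + 1):  # integer z_tilde >= 1 only
                p2 = 1.0 - exp(-((mean_Z - z_t) ** 2) / (2.0 * mean_Z))
                inner = max(z - z * M / floor(ell / z) for z in range(1, min(z_t, r) + 1))
                best = max(best, p1 * p2 * inner)
    return best


q = zipf_pmf(20, 0.6)
print(lower_bound(q, n=30, M=2))
```

Because the search space is restricted, the returned value is still a valid (if possibly weaker) lower bound under the stated assumptions.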

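Appendix D
Proof of Theorem 3

Letting $\tilde{m} = m$ and using Lemma 1 we obtain

$$\begin{aligned}
R^{\rm ub}(n, m, M, \mathbf{q}, m) &= \min\left\{ \left( \frac{m}{M} - 1 \right) \left( 1 - \left( 1 - \frac{M}{m} \right)^{n} \right),\ \bar{m} \right\} \qquad (66) \\
&\overset{(a)}{\le} \min\left\{ n \left( 1 - \frac{M}{m} \right),\ \frac{m}{M} - 1,\ \bar{m} \right\} \\
&\le \min\left\{ \frac{m}{M} - 1,\ m,\ n \right\}, \qquad (67)
\end{aligned}$$

where (a) is because $(1 - x)^n \ge 1 - nx$ for $x \le 1$. The proof of Theorem 3 follows by showing
that $\min\{\frac{m}{M} - 1, m, n\}$ is order-optimal.

In the following, we will evaluate the converse shown in Theorem 2 and compute the gap between
$R^{\rm lb}(n, m, M, \mathbf{q})$, given in (19), and $R^{\rm ub}(n, m, M, \mathbf{q}, m)$ to show the order-optimality of RLFU-GCC by
appropriately choosing the parameters $\ell$, $r$, $\tilde{z}$, $z$. Specifically we choose:

$$\ell = m, \qquad (68)$$
$$r = \delta (1 - \alpha) n, \qquad (69)$$
$$\tilde{z} = \sigma m \left( 1 - \exp\left( -\delta (1 - \alpha) \frac{n}{m} \right) \right), \qquad (70)$$

where $\delta \in (0, 1)$ and $\sigma \in (0, 1)$ are positive constants independent of the system parameters $m$, $n$, $M$, and
determined in the following, while $z$ will be determined later according to the different values of $m$, $n$, $M$.
Note that $\sigma m \left( 1 - \exp\left( -\delta (1 - \alpha) \frac{n}{m} \right) \right) \le \sigma \delta (1 - \alpha) n < \delta (1 - \alpha) n$; hence by definition $\tilde{z} \le r$.
We now compute each term in (19) individually.¹⁰ To this end, using (68) and (69) we first find an
expression for $n \ell q_\ell$ and $\ell \left( 1 - \left( 1 - \frac{1}{\ell} \right)^r \right)$ in terms of $\delta$, $\sigma$, $m$, $n$ and $\alpha$.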

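¹⁰ In evaluating (19), whenever an integer-valued parameter (such as $z$) diverges as $m \to \infty$, we ignore
the non-integer effect without mentioning it.

Specifically, by using the lemma bounding $H(\alpha, 1, m)$ we can write

$$n \ell q_\ell = n m\, \frac{m^{-\alpha}}{H(\alpha, 1, m)} \ge \frac{n m^{1-\alpha}}{\frac{1}{1-\alpha} m^{1-\alpha} - \frac{1}{1-\alpha} + 1} \ge (1 - \alpha) n + o(n), \qquad (71)$$

and

$$n \ell q_\ell = n m\, \frac{m^{-\alpha}}{H(\alpha, 1, m)} \le \frac{n m^{1-\alpha}}{\frac{1}{1-\alpha} (m+1)^{1-\alpha} - \frac{1}{1-\alpha}} = (1 - \alpha) n + o(n), \qquad (72)$$

from which it follows that

$$n \ell q_\ell = (1 - \alpha) n + o(n). \qquad (73)$$

Furthermore, using (69), we have

$$\begin{aligned}
\ell \left( 1 - \left( 1 - \frac{1}{\ell} \right)^{r} \right) &= m \left( 1 - \left( 1 - \frac{1}{m} \right)^{\delta (1 - \alpha) n} \right) \\
&= m \left( 1 - \exp\left( -\delta (1 - \alpha) \frac{n}{m} \right) \right) + o\left( m \left( 1 - \exp\left( -\delta (1 - \alpha) \frac{n}{m} \right) \right) \right). \qquad (74)
\end{aligned}$$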
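Then, by using (70), (74) and (73) in (20) and (21), we obtain

$$\begin{aligned}
P_1(\ell, r) &= 1 - \exp\left( - \frac{(n \ell q_\ell - \delta (1 - \alpha) n)^2}{2 n \ell q_\ell} \right) \\
&= 1 - \exp\left( - \frac{\left( ((1 - \alpha) n + o(n)) - \delta (1 - \alpha) n \right)^2}{2 ((1 - \alpha) n + o(n))} \right) \\
&= 1 - \exp\left( - \frac{\left( (1 - \delta)(1 - \alpha) n + o(n) \right)^2}{2 ((1 - \alpha) n + o(n))} \right) \\
&= 1 - o(1), \qquad (75)
\end{aligned}$$

and

$$P_2(\ell, r, \tilde{z}) = 1 - \exp\left( - \frac{\left( \ell \left( 1 - \left( 1 - \frac{1}{\ell} \right)^r \right) - \tilde{z} \right)^2}{2 \ell \left( 1 - \left( 1 - \frac{1}{\ell} \right)^r \right)} \right) \overset{(a)}{\ge} 1 - o(1), \qquad (76)$$

where (a) follows from the fact that

$$\frac{\left( \ell \left( 1 - \left( 1 - \frac{1}{\ell} \right)^r \right) - \tilde{z} \right)^2}{2 \ell \left( 1 - \left( 1 - \frac{1}{\ell} \right)^r \right)} = \Theta\left( m \left( 1 - \exp\left( -\delta (1 - \alpha) \frac{n}{m} \right) \right) \right).$$

Thus, by using Theorem 2, we obtain

$$\begin{aligned}
R^{\rm lb}(n, m, M, \mathbf{q}) &\overset{(a)}{\ge} P_1(\ell, r)\, P_2(\ell, r, \tilde{z}) \max_{z \in \{1, \ldots, \lceil \tilde{z} \rceil\}} \left( z - zM / \lfloor \ell / z \rfloor \right) \\
&\ge (1 - o(1))(1 - o(1)) \max_{z \in \{1, \ldots, \lceil \tilde{z} \rceil\}} \left( z - zM / \lfloor \ell / z \rfloor \right) \\
&\ge (1 - o(1))^2 \max_{z \in \{1, \ldots, \lceil \tilde{z} \rceil\}} \left( z - zM / \lfloor \ell / z \rfloor \right), \qquad (77)
\end{aligned}$$

where (a) is because $\tilde{z} \le r$.

In the following, we consider two cases, namely $n = \omega(m)$ and $n = O(m)$. For each of these two
cases, we treat separately the sub-regions of $M$ illustrated in Fig. 8 and Fig. 9, respectively.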


Fig. 8. The sub-cases of the regimes of $M$ when $n = \omega(m)$.


A. Region of $n = \omega(m)$

In this regime, using the fact that $\frac{n}{m} = \omega(1)$, the expression of $\tilde{z}$, given in (70), reduces to:

$$\tilde{z} = \sigma m (1 - o(1)). \qquad (78)$$

Fig. 9. The sub-cases of the regimes of $M$ when $n = O(m)$.


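Consequently, (77) can be rewritten as:

$$ R^{lb}(n,m,M,\mathbf q) \ge (1-o(1))^2 \max_{z\in\{1,\dots,\lceil\sigma m(1-o(1))\rceil\}}\left(z-zM/\lfloor \ell/z\rfloor\right). \tag{79} $$

1) When $\frac{1}{2\sigma(1-o(1))} \le M = o(m)$: Letting $z = \lfloor\frac{m}{2M}\rfloor$, from (79) we obtain

$$ R^{lb}(n,m,M,\mathbf q) \ge (1-o(1))^2\left(z-\frac{zM}{\lfloor m/z\rfloor}\right) = (1-o(1))^2\left\lfloor\frac{m}{2M}\right\rfloor\left(1-\frac{M}{\left\lfloor m/\lfloor\frac{m}{2M}\rfloor\right\rfloor}\right), \tag{80} $$

from which using (67) we have:

$$ \frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} \le \frac{\frac{m}{M}-1+o\left(\frac{m}{M}\right)}{(1-o(1))^2\left(\frac{1}{4}\frac{m}{M}+o\left(\frac{m}{M}\right)\right)} = 4+o(1). \tag{81} $$

2) When $M = \Theta(m)$:

• If $\frac{m}{2M} \ge 3$, letting $z = \lfloor\frac{m}{2M}\rfloor$, from (79) we get

$$ R^{lb}(n,m,M,\mathbf q) \ge (1-o(1))^2\left\lfloor\frac{m}{2M}\right\rfloor\left(1-\frac{M}{\left\lfloor m/\lfloor\frac{m}{2M}\rfloor\right\rfloor}\right) \ge (1-o(1))^2\left(\frac{1}{2}\left(\frac{m}{2M}-1\right)+o(1)\right), \tag{82} $$

from which, using (67), we obtain

$$ \frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} \le \frac{\frac{m}{M}-1+o\left(\frac{m}{M}\right)}{(1-o(1))^2\left(\frac{1}{2}\left(\frac{m}{2M}-1\right)+o(1)\right)} \le 6+o(1). \tag{83} $$

• If $\frac{m}{2M} < 3$, letting $z = 1$, from (79), we obtain

$$ R^{lb}(n,m,M,\mathbf q) \ge (1-o(1))^2\left(1-\frac{M}{m}\right), \tag{84} $$

from which, using (67), we have

$$ \frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} \le \frac{\frac{m}{M}-1+o\left(\frac{m}{M}\right)}{(1-o(1))^2\left(1-\frac{M}{m}\right)} \le \frac{m}{M}+o(1) < 6+o(1). \tag{85} $$

3) When $M < \frac{1}{2\sigma(1-o(1))}$: letting $z = \sigma m(1-o(1))$, from (79), we get

$$ R^{lb}(n,m,M,\mathbf q) \ge (1-o(1))^2\sigma m\left(1-\frac{M}{\lfloor\frac{1}{\sigma}\rfloor}\right), \tag{86} $$

from which, using (67), we obtain

$$ \frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} \le \frac{m}{(1-o(1))^2\sigma m\left(1-\frac{M}{\lfloor\frac{1}{\sigma}\rfloor}\right)} \le \frac{1}{\sigma\left(1-\frac{1}{2\sigma\lfloor\frac{1}{\sigma}\rfloor}\right)}+o(1). \tag{87} $$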
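The constants 4 and 6 appearing in this region can be checked numerically by dropping all $o(\cdot)$ terms and comparing the leading term of the achievable rate, $m/M - 1$, with the converse term $z - zM/\lfloor m/z\rfloor$. The helper names and the sample values of $m$ and $M$ below are illustrative assumptions.

```python
def lower_term(m: int, M: float, z: int) -> float:
    """z - z*M/floor(m/z): the converse term of (79) with ell = m and o(1) factors dropped."""
    return z - z * M / (m // z)


def upper_term(m: int, M: float) -> float:
    """m/M - 1: the leading term of the achievable rate, with o(m/M) dropped."""
    return m / M - 1.0


def gap(m: int, M: float, z: int) -> float:
    """Ratio of the two leading terms for a given choice of z."""
    return upper_term(m, M) / lower_term(m, M, z)


if __name__ == "__main__":
    m = 10**6
    print(gap(m, 100, m // (2 * 100)))        # M = o(m): close to 4
    print(gap(m, 10**5, m // (2 * 10**5)))    # M = Theta(m), m/(2M) >= 3: at most 6
    print(gap(m, 250000, 1))                  # M = Theta(m), m/(2M) < 3: equals m/M < 6
```

With $z = 1$ the ratio simplifies exactly to $m/M$, which is why the last case is bounded by 6 whenever $m/(2M) < 3$.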


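B. Region of $n = O(m)$

1) When $\frac{n}{m}\delta(1-\alpha) \ge 1$: $\tilde z$, given in (70), boils down to:

$$ \tilde z = \sigma m\left(1-\exp\left(-\delta(1-\alpha)\frac{n}{m}\right)\right) \ge \sigma m\left(1-e^{-1}\right). \tag{88} $$

• If $\frac{1}{2\sigma(1-e^{-1})} \le M = o(m)$, by letting $z = \lfloor\frac{m}{2M}\rfloor$, $R^{lb}(n,m,M,\mathbf q)$ is given by (80) and consequently, using (67), we have

$$ \frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} \le \frac{\frac{m}{M}-1+o\left(\frac{m}{M}\right)}{(1-o(1))^2\left(\frac{1}{4}\frac{m}{M}+o\left(\frac{m}{M}\right)\right)}. \tag{89} $$

• If $M = \Theta(m)$, by letting $z = \lfloor\frac{m}{2M}\rfloor$ when $\frac{m}{2M} \ge 3$ and letting $z = 1$ when $\frac{m}{2M} < 3$, and by using (83) and (85), respectively, we conclude that

$$ \frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} \le 6+o(1). $$

• If $M < \frac{1}{2\sigma(1-e^{-1})}$, letting $z = \sigma m\left(1-e^{-1}\right)$, by using (68)-(70) and (77), we have

$$ R^{lb}(n,m,M,\mathbf q) \ge (1-o(1))^2\sigma\left(1-e^{-1}\right)m\left(1-\frac{M}{\left\lfloor\frac{1}{\sigma(1-e^{-1})}\right\rfloor}\right), $$

from which using (67), we obtain

$$ \frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} \le \frac{m}{(1-o(1))^2\sigma\left(1-e^{-1}\right)m\left(1-\frac{M}{\left\lfloor\frac{1}{\sigma(1-e^{-1})}\right\rfloor}\right)} \le \frac{1}{\sigma\left(1-e^{-1}\right)\left(1-\frac{1}{2\sigma(1-e^{-1})\left\lfloor\frac{1}{\sigma(1-e^{-1})}\right\rfloor}\right)}+o(1). $$

2) When $\frac{n}{m}\delta(1-\alpha) < 1$: $\tilde z$ boils down to:

$$
\begin{aligned}
\tilde z &= \sigma m\left(1-\exp\left(-\delta(1-\alpha)\frac{n}{m}\right)\right) \\
&\overset{(a)}{\ge} \sigma m\left(\frac{n}{m}\delta(1-\alpha)-\frac{1}{2}\left(\frac{n}{m}\delta(1-\alpha)\right)^2\right) && (90)\\
&= \sigma\delta(1-\alpha)n-\sigma\frac{1}{2}\delta^2(1-\alpha)^2\frac{n^2}{m} && (91)\\
&\overset{(b)}{\ge} \sigma\delta(1-\alpha)n-\sigma\frac{1}{2}\frac{1}{\delta(1-\alpha)}\delta^2(1-\alpha)^2 n && (92)\\
&= \sigma\left(\delta(1-\alpha)n-\frac{1}{2}\delta(1-\alpha)n\right) = \frac{1}{2}\sigma\delta(1-\alpha)n, && (93)
\end{aligned}
$$

where (a) follows from $1-e^{-x} \ge x-\frac{x^2}{2}$, and (b) is due to the fact that $\frac{n}{m}\delta(1-\alpha) < 1$.

• If $\frac{1}{\sigma\delta(1-\alpha)}\frac{m}{n} \le M = o(m)$, by letting $z = \lfloor\frac{m}{2M}\rfloor$, $R^{lb}(n,m,M,\mathbf q)$ is given by (80) and finally, using (89), we obtain

$$ \frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} \le 4+o(1). \tag{94} $$

• If $M = \Theta(m)$, letting $z = \lfloor\frac{m}{2M}\rfloor$ when $\frac{m}{2M} \ge 3$ and letting $z = 1$ when $\frac{m}{2M} < 3$, and by using (83) and (85), respectively, we conclude that

$$ \frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} \le 6+o(1). \tag{95} $$

• If $M < \frac{1}{\sigma\delta(1-\alpha)}\frac{m}{n}$, letting $z = \frac{1}{2}\sigma\delta(1-\alpha)n$, and using (68)-(70) and (77), we have

$$ R^{lb}(n,m,M,\mathbf q) \ge (1-o(1))^2\,\frac{1}{2}\sigma\delta(1-\alpha)n\left(1-\frac{M}{\left\lfloor\frac{m}{\frac{1}{2}\sigma\delta(1-\alpha)n}\right\rfloor}\right). \tag{96} $$

Recalling that by assumption $\frac{n}{m}\delta(1-\alpha) < 1$ and $\sigma < 1$, we have that $\frac{m}{\frac{1}{2}\sigma\delta(1-\alpha)n} > 2$. We hence consider two cases. For $\frac{m}{\frac{1}{2}\sigma\delta(1-\alpha)n} \ge 3$, using (67) and (96), we obtain

$$
\begin{aligned}
\frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} &\le \frac{n}{(1-o(1))^2\frac{1}{2}\sigma\delta(1-\alpha)n\left(1-\frac{M}{\left\lfloor\frac{m}{\frac{1}{2}\sigma\delta(1-\alpha)n}\right\rfloor}\right)} \\
&\le \frac{1}{(1-o(1))^2\frac{1}{2}\sigma\delta(1-\alpha)\left(1-\frac{\frac{1}{2}\frac{m}{\frac{1}{2}\sigma\delta(1-\alpha)n}}{\frac{m}{\frac{1}{2}\sigma\delta(1-\alpha)n}-1}\right)} \\
&\le \frac{1}{(1-o(1))^2\frac{1}{2}\sigma\delta(1-\alpha)\left(1-\frac{\frac{3}{2}}{3-1}\right)} \\
&\le \frac{1}{(1-o(1))^2\frac{1}{2}\sigma\delta(1-\alpha)\left(1-\frac{3}{4}\right)} \\
&\le \frac{8}{(1-o(1))^2\sigma\delta(1-\alpha)},
\end{aligned} \tag{97}
$$

while for $2 < \frac{m}{\frac{1}{2}\sigma\delta(1-\alpha)n} < 3$, using (67) and (96), we obtain

$$
\begin{aligned}
\frac{R^{ub}(n,m,M,\mathbf q,\tilde m)}{R^{lb}(n,m,M,\mathbf q)} &\le \frac{n}{(1-o(1))^2\frac{1}{2}\sigma\delta(1-\alpha)n\left(1-\frac{M}{\left\lfloor\frac{m}{\frac{1}{2}\sigma\delta(1-\alpha)n}\right\rfloor}\right)} \\
&\le \frac{1}{(1-o(1))^2\frac{1}{2}\sigma\delta(1-\alpha)\left(1-\frac{\frac{3}{2}}{2}\right)} \\
&\le \frac{1}{(1-o(1))^2\frac{1}{2}\sigma\delta(1-\alpha)\left(1-\frac{3}{4}\right)} \\
&\le \frac{8}{(1-o(1))^2\sigma\delta(1-\alpha)}.
\end{aligned} \tag{98}
$$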
Appendix E
Proof of Theorem 5

In the regime considered by Theorem 5, the number of users is much larger than the library size. For simplicity, we write $n = \rho m^{\alpha}$.

From Lemma we have that:

$$ R^{ub}(n,m,M,\mathbf q,\tilde m) = \min\left\{\left(\frac{\tilde m}{M}-1\right)\left(1-\left(1-\frac{M}{\tilde m}\right)^{nG_{\tilde m}}\right)+\left(1-G_{\tilde m}\right)n,\ \bar m\right\}, \tag{99} $$

from which, under the condition that $\rho \ge 1$, letting $\tilde m = m$, we have

$$ R^{ub}(n,m,M,\mathbf q,m) \le \frac{m}{M}-1+o\left(\frac{m}{M}\right). $$

In this case, the converse and order-optimal results follow, with minor changes, the same procedure as 
in the case of n = w(m“) shown in the proof of Theorem (see Appendix D-Al. The non-trivial case 


is when $0 < \rho < 1$, where we have the following regions: $0 \le M < 1$, $1 \le M < \frac{m^{\alpha}}{n} = \frac{1}{\rho}$, $M \ge \frac{m^{\alpha}}{n} = \frac{1}{\rho}$. All
the sub-regions of $M$ are illustrated in Fig. 10 and will be treated separately in the following proofs.
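To get a feel for the first term of the achievable-rate expression (99) under a Zipf demand, the following sketch evaluates it directly. It assumes $q_f \propto f^{-\alpha}$ and that $G_{\tilde m}$ denotes the total demand probability of the $\tilde m$ most popular files; the function names and sample values are hypothetical, and the second term of the minimum in (99) is not modelled.

```python
def zipf_pmf(m: int, alpha: float) -> list:
    """Zipf demand distribution q_f proportional to f^(-alpha), f = 1..m."""
    weights = [f ** (-alpha) for f in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def rlfu_first_term(n: int, m: int, M: float, alpha: float, m_tilde: int) -> float:
    """(m~/M - 1) * (1 - (1 - M/m~)^(n*G)) + (1 - G) * n, the first term inside (99)."""
    if not 0 < M <= m_tilde <= m:
        raise ValueError("this sketch requires 0 < M <= m_tilde <= m")
    q = zipf_pmf(m, alpha)
    G = sum(q[:m_tilde])  # probability mass of the m_tilde most popular files
    coded = (m_tilde / M - 1.0) * (1.0 - (1.0 - M / m_tilde) ** (n * G))
    uncached = (1.0 - G) * n  # requests outside the cached set, served uncoded
    return coded + uncached


if __name__ == "__main__":
    # Illustrative values: with m_tilde = m and many users the term approaches m/M - 1.
    print(rlfu_first_term(10**4, 100, 10, 1.5, 100))
    print(rlfu_first_term(10**4, 100, 10, 1.5, 50))
```

Setting $\tilde m = m$ removes the uncached term, and for many users the value approaches $m/M - 1$, matching the bound stated for $\rho \ge 1$.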


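[Figure 10: tree of the sub-cases of $M$ for $n = \rho m^{\alpha}$: $0 \le M < 1$; $1 \le M < \frac{m^{\alpha}}{n} = \frac{1}{\rho}$; and $M \ge \frac{m^{\alpha}}{n} = \frac{1}{\rho}$, the latter split into $M = o(m)$ and $M = \Theta(m)$ with $\frac{m}{M} \ge 3$ or $\frac{m}{M} < 3$.]

Fig. 10. The sub-cases of the regimes of $M$ when $n = \rho m^{\alpha}$.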

For the remainder of this section, in evaluating (19), anytime that the value of either $z$ or $\lceil\min\{\tilde z, r\}\rceil$
diverges as $m \to \infty$, we ignore the non-integer effect without mentioning.


A. Region of $0 \le M < 1$

In this case, by using the second term of (99), we can obtain

$$ R^{ub}(n,m,M,\mathbf q,\tilde m) \le \bar m \le \tilde m+\left(1-G_{\tilde m}\right)n = 2\rho^{\frac{1}{\alpha}}m+o\left(\rho^{\frac{1}{\alpha}}m\right). \tag{100} $$

As in the proof of Theorem, we compute the converse using Theorem and appropriately choosing the parameters $r$, $\tilde z$, $z$. Specifically, we choose:

$$ \ell = \tilde m = \rho^{\frac{1}{\alpha}}m, \tag{101} $$

$$ r = \frac{\delta(\alpha-1)}{\alpha}\rho^{\frac{1}{\alpha}}m, \tag{102} $$

$$ \tilde z = \sigma m\left(1-\exp\left(-\frac{\delta(\alpha-1)}{\alpha}\rho^{\frac{1}{\alpha}}\right)\right), \tag{103} $$

$$ z = \lfloor\tilde z\rfloor, \tag{104} $$

with $0 < \delta < 1$ and $0 < \sigma < 1$ being positive constants to be determined in the following, and such that $\tilde z \le r$. Next, we compute each term in (19) individually. To this end, using (101) and (102) we first find an expression for $n\ell q_\ell$ and $\ell\left(1-\left(1-\frac{1}{\ell}\right)^{r}\right)$ in terms of $\delta$, $\alpha$, $m$, $n$ and $\sigma$.

Using (101), Lemma and the fact that $n = \rho m^{\alpha}$, we can write:


niqi = n ■ m 


(my 


H{a, 1, m) 
p^m 


> 

1 ^l-a _ 1 1 

1—a 1—a 

a — 1 


a 


and 


niq£ = n ■ m 


-p<^m + o (m) 

(m)~“ 
H{a, 1, m) 

p^m 


from which we have 


a — 1 


a 


< —i— 

1—a^ 2 1—a 

= (a — l)p°m + o(m), 

p^m + o (m) < ntqi < {a — l)p^m + o (m) 


where (105) and (106) follow from the fact that $\alpha > 1$.
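The limits used in (105) and (106) can be checked against the standard integral sandwich for the generalized harmonic sum $H(\alpha,1,m) = \sum_{i=1}^{m} i^{-\alpha}$. The sketch below assumes that this sandwich is the content of the lemma invoked above; the function names are hypothetical.

```python
def generalized_harmonic(alpha: float, m: int) -> float:
    """H(alpha, 1, m) = sum_{i=1}^{m} i^(-alpha)."""
    return sum(i ** (-alpha) for i in range(1, m + 1))


def integral_bounds(alpha: float, m: int) -> tuple:
    """Integral sandwich for alpha != 1:
    ((m+1)^(1-alpha) - 1)/(1-alpha) <= H <= (m^(1-alpha) - 1)/(1-alpha) + 1."""
    lower = ((m + 1) ** (1.0 - alpha) - 1.0) / (1.0 - alpha)
    upper = (m ** (1.0 - alpha) - 1.0) / (1.0 - alpha) + 1.0
    return lower, upper


if __name__ == "__main__":
    for alpha in (0.5, 1.5, 2.5):
        for m in (10, 1000):
            lo, hi = integral_bounds(alpha, m)
            print(alpha, m, lo, generalized_harmonic(alpha, m), hi)
```

For $\alpha > 1$ and large $m$ the upper bound tends to $\frac{\alpha}{\alpha-1}$ and the lower bound to $\frac{1}{\alpha-1}$, which is where the factors $\frac{\alpha-1}{\alpha}$ and $(\alpha-1)$ come from.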


Using (102), and the fact that $(1-1/x)^{x} \to e^{-1}$ as $x \to \infty$, we have:

1' 


^ 1 - 1 - 


i 


= m I 1 - I 1- 

m 


S(a-l) 


pam 


5{a-l) 

= m ( 1 — exp (---p° 


a 


+ o(m), 


(105) 


(106) 


(107) 


(108) 
from which, by using (20), (21), (107), and (108), we obtain

( 

Pi{(.,r^z) = 1 — exp 


S(a-l) 1 \^\ 

—- -po, m ’ 


V 


2n£q£ 




> 1 — exp 


V 


fa-1 i I / \ <5(0-1) i 

y—P‘-m + o{m) - ^ ^ p^mj 

2 ^(a — l)p^m + o (m)^ 




= l-o(l), 


and 


P2{i,r,P) 


= 1 — exp — 




= 1 — exp 


2£(i-(i-in j 

^ (^(1 - a) (m (^1 - exp + o{m 


^ 2 - exp j j m + o(m 

= 1 - 0 ( 1 ). 

Hence, replacing (101)-(104), (109) and (110) in (19), we obtain:

R^^{n, m, M, q) 

> Pi{£,r)P 2 {i,r,'z) max {z — zM/[i/z\) 

> (1 - o(l))(l - o(l)) max (z - zPl/[i/z\) 


>(l-o(l))2 I- 


zM 

III 


> (1 — o(l))^cjm ( 1 — exp ( —^pc 


> (l-»(l))V^1 /'pii(Q -l) 


\ 

1_^ 

1 

^ gd(a-l) J 

( 


a 


a 


1 - 


1 




where (a) is because $1-\exp(-x) \ge x-\frac{x^2}{2}$ for $x \ge 0$.


(109) 


( 110 ) 


m, (111) 
Using (100) and (111), we obtain

R^^{n, m, M, q, m) 


E}^{n, m, M, q) 


< 


2po^m 


(l-o(l))V 


po i5(a-l) _ 1 / p° (5 (q-1) 
a 2 



1 - 


m 


yS(a-l) 


(1-0(1))2 1- 






(a) 

< 


( 112 ) 


( 1 - 0 ( 1))2 1 - 


cr(5(Q—1) 1 / 5 (q;—1) 


cr( 5 (Q!-l) 


where (a) is because $\rho < 1$. (112) shows the order-optimality of the achievable expected rate. In fact, it
is easy to find values for the parameters $\sigma, \delta \in (0,1)$ such that the right-hand side of (111) is uniformly
bounded with respect to n,m and M, as shown in the following example: Let a = 5 = ( 5 )^- Then 


aS = which replaced in (112i yields 


m, M, q, m) 
m, M, q) 


< 


(«-i) _ 1 (Ilk f («-i) V 

2a 4 ( 2 I y 2a J 


1 - 




(113) 


Since > 2 for a > 1, then the right hand side of (11131) is a positive constant. In the following 


(a-l) 


proofs, for brevity, we do not illustrate the values of the constant parameters. 


B. Region of $1 \le M < \frac{m^{\alpha}}{n} = \frac{1}{\rho}$


Let $\tilde m = \rho^{\frac{1}{\alpha}}M^{\frac{1}{\alpha}}m$; by using (99) and Lemma we obtain

2p^m 


M / \ V m 


n Ga 


+ (1 ~ Gfn) n 


+ O 


p<^m 


(114) 


\M^-^ j 

Following the same procedure as in Appendix E-A, we use Theorem to compute the converse. All the
parameters of Theorem are summarized in the following:


£ = m = 

6(a — 1 ) , ^ 1 ^ 1 

r =- A 1 a p‘>m, 

a 

2 ; = cr ( 1 — exp (- -M » p° 

z=[z\, 


m 


(115) 

(116) 

(117) 

(118) 
with the constant parameters $\delta \in (0,1)$ and $\sigma \in (0,1)$ to be determined in the following, and such that
$\tilde z \le r$. Based on (115)-(118), we now compute each term in (19) individually in detail.

Using (115) and Lemma and following the same procedure as in (105) and (106), we can write

$$ \frac{\alpha-1}{\alpha}M^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{\alpha}}m+o(m) \le n\ell q_\ell \le (\alpha-1)M^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{\alpha}}m+o(m), \tag{119} $$

which replaced in (20), similar to (109), gives:

$$ P_1(\ell,r) \ge 1-o(1). $$


Furthermore, using (116), we obtain 




= m I 1 - I 1- 

m 


»-i) 


M- 


pam 


, . / — 1) , i 

= ml — exp- -M o po. 


a 


Replacing (117) and (121) in (21), as in (110), we obtain:


( 120 ) 


+ o (m) 


( 121 ) 


P2(^,r,J) = 1-0(1). 


( 122 ) 
Finally, replacing (115)-(118), (121), and (122) in (19), we obtain:

R^^{n, m, M, q) 

> Pi{i,r)P 2 {£,r,T) max {z — zM/[i/z\) 

> (1 - o(l))(l - o(l)) max (z - zM/[i/z\) 


z- 


zM 

III 


>il- 0 {l)fz 1- 


M 


m _ 


= (1 - o(l))V { 1 - exp { - 

/ 

1 - 


5(a — 1) i , , 

M o p<=‘ \\m 


a 


\ 


M 


1 - 


Mapam 


- 1 

fj ^ 1—exp ^M 


o(l))^o- 1^1 - exp 

1 


a 


M 


\ 

Mi 


1 


y exp ^—M Q j 

> (1 — o(l))^cT — exp f ) ) m 


/ 


\ 


1 - 


M 








--1 


> (1 — o(l))^fT ( 1 — exp ( — 


5{a — 1) 1 


a 


M o pc" m • 1 — 


= (1 - o(l))V 1 - exp - 


5{a — 1) , i 


a 


> (1 — o(l))^(T ( 1 — exp ( — 




a 


M 


M o pc" m 1 — 


M o p° m 1 — 


M 

-1 

a5(a-l) 

1 


a 

1 

aS{a—l) 

M 

1 

) 


cr5{a—1) 


-1 


where ( 123 1 follows from the fact that since 1 < M, 1 < a, <5 G (0,1), and a G (0,1), we have 

S{a - 1) A ^ adia - 1) i 


1 — exp — 


a 


-M—p^ < 


a 


-M o pc < (Mp)°‘ . 


(123) 


( 124 ) 
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On the other hand, since 


<5(a — 1) i-ct i 

1 — exp (- -M ° 


a 


> 


5(a — 1) , i 1 /(^(a — 1) , 


- 

a 2. 


a 






using (1231, we obtain 

/2*^(n, m, M, q) > (1 — o(1))^(T^ 


3_1 


a 


from which, using ( |114 i, it follows: 

R'^^{n, m, M, q, fh) 


R^^{n, m, M, q) 


< 


1 - 


2M <» p<^m 


C75(q:—1) 


- 1 


m, 


(1 - 0 ( 1 ))%! ) (1 - 

4a 


(1 — o(l))^(T(5(a — 1) (1 — 


,,4(0-1) 


-1 


m 


4 ( 0 - 1 ) 


-1 


which shows the order-optimality of the achievable expected rate of RLFU-GCC. 


(125) 


(126) 


(127) 


C. Region of $M \ge \frac{m^{\alpha}}{n} = \frac{1}{\rho}$

In this regime, let $\tilde m = m$. Using (99), we obtain

$$ R^{ub}(n,m,M,\mathbf q,\tilde m) \le \frac{m}{M}-1. \tag{128} $$

In order to prove the order-optimality of the expected rate achieved by RLFU-GCC, as before, we use
Theorem to compute the converse. All the parameters of Theorem are summarized in the following:


£ = m, 

6p(a — 1) 

r = - m, 

a 


z = am 1 — exp — 


5p{a — 1) 


a 


z = 


{LwJ’l}’ otherwise 


max 


(129) 

(130) 

(131) 

(132) 


with the constant parameters $\delta \in (0,1)$ and $\sigma \in (0,1)$ to be determined in the following, and such that
$\tilde z \le r$. We now compute each term in (19) individually in detail.

Using (129) and following the same steps as in (71) and (72), we have

$$ \frac{\alpha-1}{\alpha}\rho m+o(m) \le n\ell q_\ell \le (\alpha-1)\rho m+o(m), $$

from which, using (20), similar as in (110), we obtain

$$ P_1(\ell,r) = 1-o(1). $$


Furthermore using (130), we can write 


<11-11-1)) = m l-(l-i 




, , , Spia — 1)', , 

= m ( 1 — exp (--- ) ) + o [m) 


a 


(133) 


(134) 


(135) 


from which, using (21) and (131), we have

$$ P_2(\ell,r,\tilde z) = 1-o(1). \tag{136} $$

Replacing (129)-(132), (134) and (136) in (19), we obtain

72*^(n, m, M, q) > Pi{l,r)P 2 {(-,r,P) max {z — zM/\J!,/z\) 

,\V\} 

> (1 — o(l))^ max {z — zM/\l/z\), (137) 

with $\tilde z$ and $z$ given in (131) and (132), respectively. Note that, from (131), it follows that $\tilde z$ is lower
bounded by:

5p{a — 1) 
a 


z = am 1 — exp — 


> am 


5p{a — 1) 1 ( Sp{a — 1) 

2 


a 


a 


a6p(a — 1) 

> - m 

2a 

a6{a — 1) m 
2a M’ 


> 


(138) 


where (138) follows from the fact that, in this regime, by assumption $M \ge \frac{m^{\alpha}}{n} = \frac{1}{\rho}$ and consequently $\rho m \ge \frac{m}{M}$.

Now we consider two cases of $M$: $M = o(m)$ and $M = \Theta(m)$ (see Fig. 10).

































1) When M = o{m): From ( 132 i, 
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z = 


5a{a — 1) m 

^ M 


from which, using (1381, it follows that z < 2 ;. Hence, using (1291, (1371, and footnote 20 we obtain 


m, M, q) > (1 — o(l)) 


(a) 


25a{a — 1) m 


( 


(b) 


where in (a) we have used the fact that 


2a 

M 

5a{a — 1) 

m 

2a 

M 

da(a — 1) 

m 

2a 

M 

da(a — 1) 

m 


\ 


1 - 


M 




M 


2a 

S<7{a—1) 


M 


M 


2a 


- 1 


Sa(a-l) m 
2a M 


in (h) the fact that M > 1. Then, using (1281, we obtain: 

m, M, q, m) ^ — 1 + o(l) 


m, M, q) 




^(c-1) 


-1 


< 


(1 - (1 - 




-1 


+ 0 ( 1 ). 


2) When M = 0(m).' Then 


If ^ < 3, from ( |132| ), we have that z = 1. Hence using ( 129 1 and (137 1 , we have: 

R^^(n, m, M, q) > (1 - o(l))^ ( 1 - —^ , 

\ my 


from which 


^-1 + 0 ( 1 )) 


R^°{n, m, M, q, m) 

R^^{n,m,M,q) (1 _ o(l ))2 (l _ M) 

m , , 

< 3 + 0 ( 1 ). 


(139) 


\ (5(T(a—1) 

= + o("^) and that ^ 00 , and 


(140) 


(141) 


(142) 


If ^ > 3, from ( |132| ), we have that z = Hence using ( |132[ ) and (137) and the fact that 
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M = 0(m), we obtain 

m, M, q) > Pi{£,r)P 2 {i,r,T) max {z — zM/[£/z\) 


/ \ 


>( 1 - 0 ( 1 ))^ 




= ( 1 - 0 ( 1 ))^ 


m 

LW 


m 

LZM 

m 

Lw 

1 / m 


1 - 


1 - 


1 - 


M 

m 

L^J 

M 

m _ 

2M ^ 


> (l-o(l))3- f— -l) 

- ^ 2 \2M ) 


from which using (1281, we have 

m, M, q, m) 
m, M, q) 


2M 

f-1 + 0(1)) 

1 + 0 ( 1 )) 

(l-«(l))HQ-f) 

< 12 + o(l). 


< 


(143) 


(144) 


Thus, we finish the proof of Theorem 5.


Appendix F

Proof of Table III in Theorem 6

In this section, we provide the proof of Table III in Theorem 6, where we assume $n, m \to \infty$ and
$n = o(m^{\alpha})$. Table III considers the regions $0 \le M < 1$, $1 \le M < \frac{m^{\alpha}}{n}$, and $M \ge \frac{m^{\alpha}}{n}$, excluding the
sub-region $\{1 \le M < \frac{m^{\alpha}}{n}\} \cap \{M = \kappa n^{\frac{1}{\alpha-1}}\}$, whose order-optimality is analyzed in Appendix G and
whose corresponding order-optimal results are provided in Table B. Except for the region $0 \le M < 1$,
we further consider the subregions illustrated in Figs. 11 and 12, treated separately in the following
proofs.


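A. Region of $0 \le M < 1$


In this case, we want to prove that RLFU-GCC with $\tilde m = n^{\frac{1}{\alpha}}$ is order optimal. To this end, using (99)
and Lemma, we can write the rate for RLFU-GCC with $\tilde m = n^{\frac{1}{\alpha}}$ as:

$$ R^{ub}(n,m,M,\mathbf q,\tilde m) \le \tilde m+\left(1-G_{\tilde m}\right)n = 2n^{\frac{1}{\alpha}}+o\left(n^{\frac{1}{\alpha}}\right). \tag{145} $$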
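The scaling in (145) can be illustrated numerically: with $\tilde m = n^{1/\alpha}$, both the number of cached-set files and the expected number of requests falling outside it are of order $n^{1/\alpha}$. The sketch assumes a Zipf demand $q_f \propto f^{-\alpha}$ with $\alpha > 1$; the function names and sample values are hypothetical.

```python
def zipf_tail_mass(m: int, alpha: float, m_tilde: int) -> float:
    """1 - G_{m_tilde}: probability that a request falls outside the m_tilde most popular files."""
    weights = [f ** (-alpha) for f in range(1, m + 1)]
    return sum(weights[m_tilde:]) / sum(weights)


def head_plus_tail(n: int, m: int, alpha: float) -> float:
    """m_tilde + (1 - G_{m_tilde}) * n with m_tilde = floor(n^(1/alpha)), the middle expression of (145)."""
    m_tilde = int(n ** (1.0 / alpha))
    return m_tilde + zipf_tail_mass(m, alpha, m_tilde) * n


if __name__ == "__main__":
    n, m = 10**4, 10**5
    for alpha in (1.5, 2.0):
        print(alpha, head_plus_tail(n, m, alpha), 2 * n ** (1.0 / alpha))
```

In these examples the computed value stays below $2n^{1/\alpha}$, consistent with the way (145) is used as an upper bound.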
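[Figure 11: tree of the sub-cases of $M$ for $1 \le M < \frac{m^{\alpha}}{n}$: $1 \le M = o\left(n^{\frac{1}{\alpha-1}}\right)$, split into $M = o(m)$ and $M = \Theta(m) = \kappa_1 m$; $M = \Theta\left(n^{\frac{1}{\alpha-1}}\right) = \kappa n^{\frac{1}{\alpha-1}}$ (see Appendix G); and $\omega\left(n^{\frac{1}{\alpha-1}}\right) = M < \frac{m^{\alpha}}{n}$, split into $M = o(m)$ and $M = \Theta(m) = \kappa_1 m$ with $\frac{1}{\kappa_1} \ge 2$ or $\frac{1}{\kappa_1} < 2$.]

Fig. 11. The sub-cases of the regimes of $M$ when $1 \le M < \frac{m^{\alpha}}{n}$, where $\kappa$, $\kappa_1$ are some constants which will be given later.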


Next, similarly as in Section E-A, we evaluate the converse replacing in Theorem the following
parameters:

$$ \ell = \tilde m = n^{\frac{1}{\alpha}}, \qquad r = \frac{\delta(\alpha-1)}{\alpha}n^{\frac{1}{\alpha}}, \qquad \tilde z = \frac{\sigma\delta(\alpha-1)}{\alpha}n^{\frac{1}{\alpha}}, \qquad z = \lfloor\tilde z\rfloor, $$

with $0 < \delta < 1$ and $0 < \sigma < 1$ positive constants determined in the following and such that $\tilde z \le r$.
After some algebraic manipulations similar to the ones conducted in Section E-A, we obtain


777, M, q, m) 
R^^{n, 777, M, q) 


< 


2771 


2 




l_ 1 


(146) 




which shows the order-optimality of the achievable expected rate. Eq. (146) proves that in this regime
($0 \le M < 1$ small enough), caching cannot provide large gain, or the gain of caching is at most additive
so that $M$ cannot affect the order of the expected rate.

[Figure 12: tree of the sub-cases of $M$ for $M \ge \frac{m^{\alpha}}{n}$, split first by whether $\frac{n}{m^{\alpha-1}} = \omega(1)$ or $\frac{n}{m^{\alpha-1}} = \Theta(1)$, and then by the range of $M$ ($M = o(m)$ or $M = \Theta(m) = \kappa_1 m$) and by thresholds on $\frac{\sigma\delta(\alpha-1)}{\alpha}\frac{n}{m^{\alpha-1}}$ and $\frac{\sigma\delta(\alpha-1)}{\alpha}\frac{m}{M}$.]

Fig. 12. The sub-cases of the regimes of $M$ when $M \ge \frac{m^{\alpha}}{n}$, where $\kappa_1$, $\sigma$, $\delta$ are some constants which will be given later.


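B. Region of $1 \le M < \frac{m^{\alpha}}{n}$

In this regime of $M$, we have three cases to consider, which are $1 \le M = o\left(n^{\frac{1}{\alpha-1}}\right)$, $M = \Theta\left(n^{\frac{1}{\alpha-1}}\right)$
and $\omega\left(n^{\frac{1}{\alpha-1}}\right) = M < \frac{m^{\alpha}}{n}$ (see Fig. 11).

1) When $1 \le M = o\left(n^{\frac{1}{\alpha-1}}\right)$: This case further splits in two scenarios: $M = o(m)$ and $M = \Theta(m)$
(see Fig. 11).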

• If $M = o(m)$, letting $\tilde m = M^{\frac{1}{\alpha}}n^{\frac{1}{\alpha}}$, using (99) and Lemma, after some algebraic manipulations,
we obtain

nGrf,\ 




m 


M 


M 


m 


+ (1 — Gffi) n 


2 M-n- 

<-h O 

- M 


' 1 1 
Mo.no. 

M 


(147) 


Next, we prove the order-optimality of the expected rate achieved by the proposed scheme. Following
similar steps as before, we use Theorem to compute the converse. The parameters required in
Theorem are summarized in the following.


i = m = M, 

6{a — 1) - i 

r =- -M o n° 


a 


a5ia — 1) , ^1^ i 
z =-;- -M ° n“, 


2a 


z= [z].^ 


(148) 

(149) 

(150) 

(151) 


with $0 < \delta < 1$ and $0 < \sigma < 1$ positive constants determined in the following. Note that by
definition $\tilde z \le r \le \tilde m$. Next we compute each term in (19) individually. To this end, using (148)
and (149), we first find an expression for $n\ell q_\ell$ and $\ell\left(1-\left(1-\frac{1}{\ell}\right)^{r}\right)$ in terms of $\delta$, $\alpha$, $m$, $n$ and $\sigma$.
Specifically, using (148) and Lemma we have


niqi = n ■ m 


{m)~ 


> 


H{a, 1, m) 

1 —a 

nm 


1 77jl-a_1_L 1 

I—a 


I — — 


m 


a—l 


—\1 
1—a 1—a 


777,a-l 


l-g 1-a rv—^ 

nM « n « m 

a -1 

a—l a—l 


a — l 


a 


M o, n<»+o(M ° n 


and 


nlq^ = n ■ m ■ 


[m] 


< 1 


H{a, 1, m) 
nrh^~°‘ 


1 —Q 




nm 


a—l 


—— 
a-l”^ a-1 


a—l 


J_ ( rn Y 
)i—l y m+1 J 

= {a — 1)M~n'^ + o 


from which 


(152) 


(153) 


o n° + o o no^ < niqi < (a — 1)M » n^ + o o ^ . 


( 154 ) 
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Next using (1481 and (149) , we have 


^ 1 - 1 - 


= ml 1-1- 


m 




= M n “ ( 1 — exp ( — 


6{a — 1) 
aM 


+ o(M°n»(l — exp ( — 


Then, by using (20), (149), and (154), we obtain

/ 

Pi {£, r) = 1 — exp 






2n£qe 


> 1 — exp 


o, ric + o (jll o no.'^ — M = n^.^ 


V 


2((a —1)M c n^+o(M c n 


= l-o(l). 


while using (21), (150) and (155), we have

P2{£,r,I) 

= 1 — exp — 


2 ^( 1 -( 1 - 1 )^) 




where (a) follows from 


Ha-1) 


2a 


<^ 1 - 1 -- < 


1 




M a nc 


a 


6{a — 1) 
aM 


(155) 


(156) 


(157) 


(158) 


with (158) derived from (155) using $1-\exp(-x) \ge x-\frac{x^2}{2}$ for $x \ge 0$, and $1-\exp(-x) \le x$ for
$x \ge 0$.

Finally, replacing Eqs. (148)-(151), (156), and (157) in Theorem and using footnote 20 we obtain
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R^^{n, m, M, q) 


> Pi{£,r)P 2 {£,r,T) max {z — zM/[i/z\) 

> (1 - o(l))(l - o(l)) max (z - zM/[i/z\) 

ze{i,-,\z]} 


>{l-o{l))^z 1- 


M 

III 


>(i-o(i)y 

>(l-o(l))2 

= (l-o(l))2 
>(l-o(l))2 


,a6(a — 1 ) , ^ 1-0 1 

' ^ 'M~n^ 


2a 


1 , 

b ! 

1 ) 

2a 


a6{a — 

1 ) 

2a 


a6{a — 

1 ) 


M n c 


1 - 


1 - - 


M 


Mana 


’-M^nc 


M 


_ Mana _ _ 1 


M o n° 1 — 


2a _L 

(T(5(q:—1) M 


2a 


from which, using ( |147[ ), we obtain 

P'^^{n, m, M, q, in 
72*^(n, m, M, q) 


< 


< 


2a 


cr(5(a—1) 

2Mini 

+ 0 1 

M 

(j5(ck—1) 

(- 

2a 

2 + o(l) 

f75(a—1) 

(l- 


M n<^, 


(M«) 


aS(c-l) 




T5(a-1) 


-1 


(159) 


(160) 


where $\sigma, \delta \in (0,1)$ can be chosen accordingly (possibly as a function of $\alpha$) such that




1 — 


“3S- 

■5(0-1) ■ 


is a positive constant, which shows the order-optimality of the expected rate. 

• If $M = \Theta(m) = \kappa_1 m+o(m)$, where $0 < \kappa_1 < 1$ is a given constant such that $1 \le M < \frac{m^{\alpha}}{n}$ and
$1 \le M = o\left(n^{\frac{1}{\alpha-1}}\right)$, using (99) and letting $\tilde m = m$, we obtain


P'^'^{n,m,M,q,m) < (^-l)(l-(l-^) )+(l-Gm)n 


m 


M 


M 


m / m\ 


m 


nGp 


(161) 


Following similar steps as before, we use Theorem to compute the converse. The parameters
appearing in Theorem are summarized in the following.

£ = rh = m, 

5{a — 1) n 


r = 


a—1 ’ 




a m 
a5{a — 1) n 


a m' 


a—1 


z = max • 


( 162 ) 

(163) 

(164) 

(165) 


with $0 < \delta < 1$ and $0 < \sigma < 1$ positive constants determined in the following. Note that by
definition $\tilde z \le r$. As before, we compute each term in (19) individually. To this end, using (162)
and (163), we first find an expression for $n\ell q_\ell$ and $\ell\left(1-\left(1-\frac{1}{\ell}\right)^{r}\right)$ in terms of $\delta$, $\alpha$, $m$, $n$ and $\sigma$.
Specifically, using (162) and Lemma, following similar steps as in (105) and (106), we obtain

(166) 


(a — 1) n 
a m' 


from which, using (20) and ( |163[ ), we obtain 

/ 

Pi{i,r,z) = 1 — exp 


<?(«-!)n 


V 


2n£qi 




> 1 — exp 

= l-o(l). 


/ _ L q( ri ^ _ ^(a-1) n \^\ 




2(a-l)^+»(sifer)) 


On the other hand, using (148) and (149), via Taylor expansion, we obtain


1-11-^ 


= m I 1 - 1 1 - — 
m 


6{a — 1) n 


a m' 


Q —1 


+ o 


n 


m 


0.-1 j ’ 


from which, using ( [2T] ) and ( 164 ), after some algebra, we have 

P2{£,r,I) 

' YY-Y-\Y)-^f' 


= 1 — exp — - 


= 1 -o(l), 


2£{i-{i-\y) 


where (a) follows from the fact that 




= 0 


n 


a—1 


m 


(167) 


(168) 


( 169 ) 
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In the following, we distinguish between two cases: ^ ^ > 3 and ^ = -^ < S (see Fig. 

- If ^ ^ > 3, from ( |165[ ), we have: 

m 


z = 


V2M 


from which replacing Eqs. ( |162| )-( [T64| ), ( |167| ), ( |169| ) and ( |170| ) in ( [T^ , we have 

m, M, q) 

> Pi{i,r)P 2 {£,r,^ max {z — zM/[£/z\) 

> (l-o(l))(l-o(l)) max (z- zM/[i/z\) 

ze{i,-,\z\} 


Thus, we obtain 


>(l-o(l))2z 1- 


>( 1 - 0 ( 1 ))^ 


>( 1 - 0 ( 1 ))^ 


m 

LlM 


M \ 

W\) 

( 

1 - 

V 


M 


m 


1 - 


12M 

^ 2 \2M ) 


m 

LW 

1 / m 


1 - 


L 2M J 

M 

m _ 2 

2M ^ 


72“°(n, m, M, q, m) 
72**’(n, m, M, q) 


< 


m 

M 


2 

“ I _ M 
2 m 

2 

< T3T = 12, 
2 3 


proving the order-optimality of the achievable expected rate. 
- If ^ ^ < 3, from ( |165[ ), we have: 


11 ). 


(170) 


(171) 


(172) 


z = 1 


( 173 ) 
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from which replacing Eqs. ( |162| )-( [T64| ), ( |167| ), ( |169| ) and ( |173| ) in ( [T^ , we have 

m, M, q) 

> Pi{£,r)P 2 {i,r,T) max {z — zM/[i/z\) 


> ( 1 - 0 ( 1 ))^ ( 1 --). 

m J 


(174) 


Thus, we obtain 


m, M, q, m) 
m, M, q) 


< 


m 1 

M ^ 


(1-»(!))“ (1-") 

m /-I _ M \ 

^ M m) 

m 


(175) 


2) When $\omega\left(n^{\frac{1}{\alpha-1}}\right) = M < \frac{m^{\alpha}}{n}$ (see Fig. 11 for a reminder): In this case, letting
$\tilde m = M$, we have




(176) 


Next we consider two regimes: $M = o(m)$ and $M = \Theta(m)$.

• If $M = o(m)$, using (176) and Lemma we have

72“'’(n, m, M, q, m) < + o . (177) 

To compute the converse, in this case we use the second term of (19), with the parameters given
by:

I = cM, (178) 

S{a — l)c^“" 


r = 


-nM 


1 —a 


a 


(179) 

(180) 


z = a, 

and $c > 1$, $0 < \delta < 1$, and $0 < \sigma < 1$ positive constants determined in the following. Next we
compute each term in (19) individually. To this end, using (178) and (179), and recalling that $\tilde m = M$,
we first find an expression for $n\ell q_\ell$ and $\ell\left(1-\left(1-\frac{1}{\ell}\right)^{r}\right)$ in terms of $\delta$, $\alpha$, $m$, $n$ and $\sigma$. Specifically,
using (148) and Lemma we have


n£qi = n • cM 


{cuy 


H{a, l,m) 

—1—— —i— -1- 1 

l-a'"' l-a ^ ^ 

(a — l)c^““ 


> 


-nM 


1 — 0 . 


+ o(nMi-"), 


a 


(181) 
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and 


\ —a. 


n (cM) 

nige = n-cM ■ ——--- 

H (a, 1, m) 

n{cM)^~° 


T^im + iy 

= {a- + o (nM^"") 


\1—a _ _ 


1—a 
fl—a 


= 1 — exp 


> 1 — exp 


^ (^n£qi — — nM^ ^ 


V 


2nlqi 




2 ((a - l)ci-“nMi-“ + o (nMi-“)) 


V 




\ 


/ 


while using (211 and ( |180| ) we have 

^2(^,1,^ = 1 -expl-^^*^—^ 


(182) 


from which 

{a-iy "" nM^-^ + o(nM^-") < n% < (a - l)c^““nMi-“ + o . (183) 

a 

Using ( [ 20 ] ), ( |179| l, and ( |183| ), we obtain 

Pi(^,r) 


(184) 


(185) 


Then, replacing ( |178 1 -( I 8 O 1 , ( 184 i , and ( 185 1 in the second term of ( [T9l ), we obtain 

m, M, q) 

> Pi(^,r)P2(^,l,5)(l-M/£) 

> (1 - 0 ( 1 )) + o(nMi-“)) - exp 

> (1 - 0 ( 1 )) - exp (1 - + o(nM^-“) 



M \ 


( 186 ) 
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from which using ( |177 1, we obtain 

72“'’(n, m, M, q, m) 

72"’(n, m, M, q) 


< 


nM^ " + o{nM^ “) 


(l-o(l)) 1-exp 




< 


1 — exp 




(1-^) 


(l - o(nMi-“) 

+ o(l), (187) 


1\ (1—5)2(a—l)ci" 


2a2 


which shows the order-optimality of the achievable expected rate for RLFU and RAP.

• If $M = \Theta(m) = \kappa_1 m+o(m)$, with $0 < \kappa_1 < 1$, using (176) and Lemma we have

$$ R^{ub}(n,m,M,\mathbf q,\tilde m) \le \left(\kappa_1^{1-\alpha}-1\right)nm^{1-\alpha}+o(1) = \left(1-\kappa_1^{\alpha-1}\right)nM^{1-\alpha}+o(1). \tag{188} $$

To compute the converse, similar as before, we use the second term of (19). The values of the
parameters $\ell$, $r$, and $\tilde z$ in (19) will be different based on the fact that $\frac{m}{M} = \frac{1}{\kappa_1} \ge 2$ and $\frac{m}{M} = \frac{1}{\kappa_1} < 2$
(see Fig. 11).


- When $\frac{m}{M} = \frac{1}{\kappa_1} \ge 2$, the parameters $\ell$, $r$, and $\tilde z$ in (19) are given as in (178)-(180) with the
additional constraint that $1 < c \le 2$ to guarantee that $\ell \le m$. Following the same steps as in
the case of $M = o(m)$, we obtain (186), from which, using (188) we obtain:
72“’’(n, m, M, q, m 


72"’(n, m, M, q) 


< 


1 


1 — exp 


(1-0- 


(1-D 


1\ (1-^)^(q-1)c‘ 


- + o(l), (189) 


2a 


which shows the order-optimality of the achievable expected rate for RLFU and RAP. 
- When ^ ^ < 2, the parameters i, r, and z in (19) are given by: 

£ = m, 


a — 1 


r = 


^5nM 


a 


1—a 


z = 


(190) 

(191) 

(192) 


with 0 < J < 1, and 0 < cr < 1 positive constants determined in the following. Next we 
compute each term in ( [T9| ) individually. Specifically, using ( |190| ) and Lemma recalling that 
m = M we have 


n£q£ = n ■ m 


m 


> 

—h 1 

1—a 1—a ' 

a — 1 


H{a, 1, m) 

1— (y. 

nm 


+ o(nM^-"), 


l—a\ 


a 


(193) 
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and 


\ —a 


0 ("i) “ 

niqt = n • m • ——--- 

ti (a, 1, m) 

nM^~^ 


/ ±—a. 


Thus, by using (193) and (194), we obtain 


“—+ o(nMi-“) < < {a - + o(nM 

a 


1—a't 


from which, using ( |1911 l we 

Pi(Ar) 


have: 


a i ) 


( {nlqi - 

= 1 — exp — ^;— 

y 2neq£ j 

{ (1-(i)2(a- 

>l-exp(-5^- j 

- 2a2 2 V 2a2 


1—a 


: have 


(194) 


(195) 


Furthermore using using ( [2T] ) and ( 192 ), we 

P 2 {i,l,z) = 1 - exp j , 

from which replacing ( |190[ )-( fT9^ , ( |196| ) and ( |197| ) in the second term of ( [T^ we obtain 

m, M, q) 

>Pi{i,r)P2{e,hI){l-M/£) 


(196) 


(197) 


( 198 ) 
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Thus, by using ( |188[ ) and ( |198| ), we obtain 

m, M, q, rh) 


R^^{n, m, M, q) 


< 


(1 - nAfi-" + o(l) 




l-exp((ly£))(l-M) 


< 




exp ^ 


I _ 


< 


(l-<5io)^(a-l) 1 / 1 _ / (l-o-io)" 

2a2 2“-i \ ^ \ 2 


Note that 


2__A£ 


1-^ 


< 


1 _ 

+^(1)- 


1, a < 2 

a — 1, a > 2 


from which using (|199|), and (200|), we obtain 


R^^{n, m, M, q, m) 


< 


max{l, a — 1} 


R^^{n,m,M,ci) - (i-^)^(«-i) i (. 

2o,2 2“-i 2 J J 

which shows the order-optimality of the achievable expected rate for RLFU and RAP. 


(199) 

(200a) 

(200b) 

( 201 ) 


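C. Region of $M \ge \frac{m^{\alpha}}{n}$


In this case, we want to prove that RLFU-GCC with $\tilde m = m$ is order optimal. To this end, using (99),
we can write the rate for RLFU-GCC with $\tilde m = m$ as:

$$ R^{ub}(n,m,M,\mathbf q,\tilde m) \le \left(\frac{\tilde m}{M}-1\right)\left(1-\left(1-\frac{M}{\tilde m}\right)^{nG_{\tilde m}}\right)+\left(1-G_{\tilde m}\right)n \le \frac{m}{M}-1+o\left(\frac{m}{M}\right). \tag{202} $$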


At this point, we distinguish between two cases: $\frac{n}{m^{\alpha-1}} = \omega(1)$ and $\frac{n}{m^{\alpha-1}} = \Theta(1)$ (see Fig. 12).


































1) When ^2-1 = To compute the converse, again, we evaluate the first term of (191 in Theorem 

with the parameters defined as: 


= m, 

6{a — 1) n 


r = 


a m' 


a—1 ' 


Z = 


a5{a — 1) n 


a m' 
aS{a-l) m 


a—1 ’ 


(203) 

(204) 

(205) 


z = 


M = o{m) 


(206) 


a M 

M = e{m) 

with $0 < \delta < 1$ and $0 < \sigma < 1$ positive constants determined in the following. Note that by definition
$\tilde z \le r$. Next we compute each term in (19) individually. To this end, using (203) and (204), and recalling
that $\tilde m = m$, we first find an expression for $n\ell q_\ell$ and $\ell\left(1-\left(1-\frac{1}{\ell}\right)^{r}\right)$ in terms of $\delta$, $\alpha$, $m$, $n$ and $\sigma$.
Specifically, using (203) and Lemma, following similar steps as in (71) and (72), we have

a — 1 n 
a \m'- 


T. f n \ ^ ^ f 'n \ 


(207) 


from which, replacing ( |204| ) and ( |207[ ) in (201, we have: 

/ ( ("-!) 

Pii^,r) > 1 — exp 


n _ _ (5(q—1) n 


V 

'=’ 1 - 0 ( 1 ). 


2{(«-l)sifer+o{^)) 


(208) 


Furthermore, using (203) and (204), via Taylor expansion, we obtain


<11-11-^ 


= m I 1 - I 1 - — 

m 


H°-i) 


5{a — 1) n 
a 

from which, replacing ( |205| ), and (2091 in ([2T]), we have: 


/ n 

a-l + ° I ’ 


(209) 


P2{i,r,7) 


> 1 — exp 


/ f &{a-l) n \ o( ^ n V 

I a m°‘~^ ' ^ ) a J 


V 


( <?(«-!) _L ( n 


= 1 - 0 ( 1 ). 


( 210 ) 














































Thus, replacing (208) and (210) in the second term of (19), we obtain

R^^{n, m, M, q) 
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> Pi{£,r)P 2 {£,r,T) max {z — zM/[i/z\) 

,\z]} 

> (1 - o(l))(l - o(l)) max (z - zM/[i/z\) 

max. z{l-M/[£/z\). 


( 211 ) 


Next, we find an explicit expression for max^gji... z (1 — M/[£/zJ). Using ( 2061 , after some 
algebraic manipulations, we have 

'a6(a — l) / —1)\ m fm\ ^ 


max z I 1 — 


M 

¥N\ 


> 


l\/f \ TY) 

1-, M = 0(m) and — < 3 

my M 

M = e(m) and > 3, 


(212b) 


(212c) 


where (212a) follows from footnote 20 and from the fact that $M \ge \frac{m^{\alpha}}{n}$, while (212c) follows from the
fact that $z = \lfloor\frac{m}{2M}\rfloor \ge \frac{m}{2M}-1$ and $\lfloor m/z\rfloor \ge \frac{m}{z}-1$.

Replacing (212) in (211) and using (202), after simple algebraic manipulations we have:

+ 0 ( 1 ), M = o{m) 


it"*" 


< 


(jS(a—l) 


( 1 -^) 


7Tl Tfh 

^+ 0 ( 1 ) < 3 + 0 ( 1 ), M = 0(m) and — <3 


m 


“ 0 ( 1 )) + 0 ( 1 ) < 12 + 0 ( 1 ), M = 0(m) and — > 3 

12 m> 


2) When = 0(1) (see Fig. Since M > ^, we have that ^ and M = 0(m). To 

compute the converse, similar as before, we use ( [T^ in Theorem with all the parameters given by: 

(214) 

(215) 


£ = m, 

6{a — 1) n 


r = 


z = < 


z = < 


a m' 

cr, 

a5{a—l) _ n 

a m“' 

1 , 


a—1 ’ 


aS(a-l) _2T_ 

a m°‘~ 


< 1 


aS{a-l) n ^ 

rr < 1 


(216) 


a. 

a5(a-l) 


-I cr5(a-l) ri crS{a-l) m ^ o 

-‘-J av rr,<»-l ^ ^ A/f ^ ^ ’ 


a m 

m I cri5(a-l) 


L*J. 


’ a M 

I ^ C7<5(a—1) m \ o 

a M — ^ 


(217) 
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with 0 < (5 < 1, and 0 < u < 1 positive constants determined in the following. Next, we compute 
each te 

(T5(a—1) 


each term in (191 individually. In doing this, we consider to further regions 


_i < 1 and 


^ > 1 (see Fig. 


111 . 


a m 

. If < 1, using pi4| ) and Lemma following similar steps as in ( fTT] ) and ( [72l ), we have 


a — 1 n 


a m'^ 


from which using (201 and ( |215| ), we obtain 

/ 

Pi{i,r) = 1 — exp 


/ n \ „ , , n f n \ 

T + o (- T ) < niqi < (a — 1)-r + o (-r ) 


(218) 


<?(«-!)n 


\ 

( /(«-!) 


2n(.qg 




> 1 — exp 


<?(«-!)n 

a m' 


W + o(- 


2((a-l)^+o(^)) 


= 1 — exp 

V 

(«) / 

> 1 — exp — 


({1-5) 


(«-i) 




2((«-l)^) 
2a2 


(219) 


where (a) follows from the fact that 0(1) = ^ > 1- Furthermore, using using ( [21] ) and 

( 216| ), we have 

( 220 ) 


P 2 {^, 1,5) = 1 - exp 1^- 

Replacing ( |214| )-( |2T^ , \2\9) and ( |220| ) in the second term of ( [T^ , we obtain 

m, M, q) 


>Pi{£,r)P 2 {£,l,I){l-M/£) 
{l-5)Ha-l) 


> 1 — exp — 


2a2 


M\ 

1 -— , 

m J 


( 221 ) 


from which, using (202), we have

R'^^{n, m, M, q, m) 


^-1 + 0(1) 


R^'°{n, m, M, q) 




< 


(a) 

< 


1 


m 


fT(t(Q:—1) 

^ _ 

(l-<5)2(a-l) 

2a2 


M 


+ o{l) 


1 — exp — 


+ o(l), 


( 222 ) 
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where (a) is because that ^ ^ —pr. 

^ ^ M — m“ ^ — ao(a—l) 


< 


, jf using (214) and (215), and following the same procedure as (158), via Taylor 


expansion, we obtain 


01-11-^ 


where (a) is due to — ©(I)- Hence, replacing (214), (215), and (218) in (20), we obtain: 


= m I 1 - 1 1- 

m 




(a) 5{a — 1) n 


a m'- 


zii + o(l)> 


(223) 


Pi(£, r) = 1 — exp 


= 1 — exp 


(a) 


S{a-l) n 


\ 


2n£qi 

\ 


((1-J) 


/ 


(l-5)2(a-l) 


a 


2 a 2 a5(a — 1) 
2aa5 


> 1 — exp 

= 1 — exp 

where (a) is because that ^ vvhile replacing (216) and ( |223| ) in (21 1 , we obtain: 


(224) 


P2{i,r,7) 


> 1 — exp 




= 1 — exp — 


o ( <5(a-l) n I n ( ’^ ) 1 

y a ° j y 

(1 — a)'^6{a — 1) n 


2a 


m 


a—1 


> 1 — exp 1^— 
= 1 — exp ( — 


(1 — a)‘^5{a — 1) a 


2a 

2o- 


a£i{a — 1 ) 


(225) 


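Replacing (214)-(217), (224), and (225) in the first term of (19), and noticing from (216) that,
when $\frac{\sigma\delta(\alpha-1)}{\alpha}\frac{n}{m^{\alpha-1}} \ge 1$, by definition $\tilde z < r$, we obtain

$$
\begin{aligned}
R^{\rm lb}(n,m,M,\mathbf{q}) &\ge P_1(\ell,r)\,P_2(\ell,r,\tilde z)\max_{z\in\{1,\dots,\lceil\tilde z\rceil\}}\left(z - zM/\lfloor \ell/z\rfloor\right)\\
&\ge \left(1-\exp\left(-\frac{(1-\delta)^2}{2\alpha\sigma\delta}\right)\right)\left(1-\exp\left(-\frac{(1-\sigma)^2}{2\sigma}\right)\right)\max_{z\in\{1,\dots,\lceil\tilde z\rceil\}} z\left(1 - M/\lfloor \ell/z\rfloor\right).
\end{aligned}
\tag{226}
$$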


Replacing (214) and (217) in (226), we have


/ M 
max z [1- 


> 


ad{a - 1) m ^ ^ 
my a M 

\ ( ra \ , . a6(a — 1) m 

-( -l)+o(l), 


,2 \2M 

from which, replacing (227) in (226) and using (202), we obtain

2 a 


> 2 






< 


a6{a — 1) ^1 — exp 


2oicrS 




(227a) 

(227b) 


®xp \ 2aaS 


l-exp(-^i^ 


1 aS{a—l) 

2 


+ o(l), (228a) 


+ o(l), (228b) 


2a 


where (228a I holds for 


aS{a-l) 2 n 

a M 


< 2, while (228b I holds for 


cr5{a-l) m 


——> 2. Note that p27a| ) 


follows from lower bounding z by ^ — 1, while ( |228[ ) shows the order-optimality of the achievable 
expected rate for RLFU and RAP. 


Appendix G

Proof of Table III in Theorem 6

In this section, we prove the order-optimality of RLFU for the case when n = o (m“) and the memory 

Q 1 1 1 Q, 

is such that 1 < M < ^ and M = 0(n“-i) = + 0 ( 71 °-^) < ^ for some positive constant k. 


In this case, we have two cases to consider, namely $\kappa > 1$ and $\kappa < 1$ (see Fig. 13). All the subregions
of $M$ are illustrated in Fig. 13 and are treated separately in the following proofs.


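In the following, since we focus on the asymptotic regime ($n \to \infty$), in $M = \kappa n^{\frac{1}{\alpha-1}} + o(n^{\frac{1}{\alpha-1}})$ we
ignore the $o(n^{\frac{1}{\alpha-1}})$ term and write directly $M = \kappa n^{\frac{1}{\alpha-1}}$.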

A. Region of $\kappa < 1$


In this case, we need to consider the scenarios $M = o(m)$ and $M = \Theta(m)$ separately (see Fig. 13).

Fig. 13. The sub-cases of the regimes of M when 1 < M < and M = 0(n“-i), where k, ki, rj are some constants 
which will be determined later. 


1) When M = o{m): Within this region, we need to consider two subregions again, k < ^ 

and K > ' (see Fig. § , where r] = aS with 0 < cr < 1 and 0 < 5 < 1 positive constants 

determined later. 


If K < ( ^ , letting m = M<^n^ = using (99l and Lemmaj^ we have 


m, M, q, m) < 


2M<»n“ fMcn<^\ i i , , 

+ o I 1 = 2 k ^-^ - 1 + o(l). 


(229) 


M \ M 

To compute the converse, as in the previous sections, we use the converse bound (19). The required
parameters are summarized in the following.


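$$\ell = \tilde m = M^{\frac{1}{\alpha}} n^{\frac{1}{\alpha}} = \kappa^{\frac{1}{\alpha}} n^{\frac{1}{\alpha-1}}, \tag{230}$$

$$r = \frac{\delta(\alpha-1)}{\alpha}\,\kappa^{\frac{1}{\alpha}-1}, \tag{231}$$

$$\tilde z = \frac{\sigma\delta(\alpha-1)}{\alpha}\,\kappa^{\frac{1}{\alpha}-1}, \tag{232}$$

$$z = \lfloor \tilde z \rfloor, \tag{233}$$
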

with $0 < \delta < 1$ and $0 < \sigma < 1$ positive constants determined in the following. Note that, by definition,
$\tilde z < r$. Next, we compute each term in (19) individually. To this end, using (230) and (231), we first
find an expression for $n\ell q_\ell$ and $\ell\left(1 - (1 - 1/\ell)^r\right)$ in terms of $\delta$, $\alpha$, $m$, $n$ and $\sigma$. Specifically, using
( 230 1 and Lemmafollowing similar steps as in ( |152 i and ( 153 1, we have 


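$$\frac{\alpha-1}{\alpha}\,\kappa^{\frac{1}{\alpha}-1} + o(1) \le n\ell q_\ell \le (\alpha-1)\,\kappa^{\frac{1}{\alpha}-1} + o(1), \tag{234}$$

while using (230) and (231), via Taylor expansion, we obtain

$$\ell\left(1-\left(1-\frac{1}{\ell}\right)^{r}\right) = \tilde m\left(1-\left(1-\frac{1}{\tilde m}\right)^{r}\right) = \frac{\delta(\alpha-1)}{\alpha}\,\kappa^{\frac{1}{\alpha}-1} + o(1). \tag{235}$$

Replacing (230)-(232), (234) and (235) in (20) and in (21), we obtain

$$
\begin{aligned}
P_1(\ell,r) &\ge 1-\exp\left(-\frac{\left(\frac{\alpha-1}{\alpha}\kappa^{\frac{1}{\alpha}-1} - \frac{\delta(\alpha-1)}{\alpha}\kappa^{\frac{1}{\alpha}-1} + o(1)\right)^2}{2\left((\alpha-1)\kappa^{\frac{1}{\alpha}-1} + o(1)\right)}\right)\\
&= 1-\exp\left(-\frac{(1-\delta)^2(\alpha-1)}{2\alpha^2}\,\kappa^{\frac{1}{\alpha}-1} + o(1)\right)\\
&\overset{(a)}{\ge} 1-\exp\left(-\frac{(1-\delta)^2(\alpha-1)}{2\alpha^2}\right)
\end{aligned}
\tag{236}
$$

and

$$P_2(\ell,r,\tilde z) \ge 1-\exp\left(-\frac{(1-\sigma)^2(\alpha-1)\delta}{2\alpha}\right), \tag{237}$$

where (a) in (236) follows from ignoring $o(1)$ and from the fact that $\kappa < 1$. Finally, replacing Eqs.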
(230)-(233), (236) and (237) in the converse bound (19), and using footnote 20, we obtain:

m, M, q) > Pi{i,r)P 2 {i,r,T) max {z — zM/[£/z\) 


> H max \z\ 1 — 


(“) ^ / M 

> E-z\l- 


M 


m 

L^J 


LfJ 


“ a 


„ (jb{a — 1) i_i 

> 

a 


1 - 


M 




1 - 


M 


(b) „ cj5(q; — 1) i_i 

- • __ UC n 

a 


1 - 


a5{a—l) 

1 


_ 1 


„ cr5(Q; — 1) i_i 
a 


1 - 


t _ i_ 

a5{a—l) M 

1 


(T(5(«—1) 




where in (a) we have defined 


H = 1 - exp - 


(1 — (T)^(a — 1)(5 
2a 


1 — exp — 


(1-J£(a^ 

2a2 


(238) 


while (b) follows from the fact that M = . Finally using (2291 and (238), we obtain 

m, M, q, fh) 


E}^{n, m, M, q) 


< 


cr5(a—1) 
a 


1 - 




^5(0-1) 


(239) 
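
To get a feel for the size of the constant $\Xi$ defined in step (a) of (238), it can be evaluated numerically. This is only a sketch under stated assumptions: the function name `xi` and the sample parameter values are hypothetical, and the two factors are transcribed from the definition of $\Xi$ as it reads in (238), namely $1-\exp\left(-\frac{(1-\sigma)^2(\alpha-1)\delta}{2\alpha}\right)$ and $1-\exp\left(-\frac{(1-\delta)^2(\alpha-1)}{2\alpha^2}\right)$.

```python
import math


def xi(alpha, delta, sigma):
    """Evaluate the constant Xi of (238) for given alpha > 1 and 0 < delta, sigma < 1.

    The two factors are the lower bounds on P1 and P2 from (236) and (237);
    the function name and interface are illustrative assumptions.
    """
    p1_bound = 1.0 - math.exp(-((1 - delta) ** 2) * (alpha - 1) / (2 * alpha ** 2))
    p2_bound = 1.0 - math.exp(-((1 - sigma) ** 2) * (alpha - 1) * delta / (2 * alpha))
    return p1_bound * p2_bound


# Example: alpha = 2, delta = sigma = 1/2 (both exponents equal 1/32 here).
example = xi(2.0, 0.5, 0.5)
```

The constant is strictly positive and independent of $n$ and $m$, which is all the order-optimality argument needs; its numerical value is small, so the resulting gap is a constant but not a tight one.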


If «: > 


\ letting 

DUb/-^ _ 


KU- 


J using 

< 

(1- 

Gih)n 


n 


< 





-1 


'§ 


n 


l—a 




(240) 


21 It can be easily seen that if we let $\tilde m = M^{\frac{1}{\alpha}} n^{\frac{1}{\alpha}}$, then the achievable expected rate is also order-optimal. Here we only
illustrate the case of $\tilde m = M = \kappa n^{\frac{1}{\alpha-1}}$.
To compute the converse, as before, we use the second term of (19), where
the required parameters are summarized in the following:


i = cM, 

6{a — 


r = 


z = a, 


a 


(241) 

(242) 

(243) 


with l<c, 0<<5<1, and 0 < cr < 1 positive constants determined in the following. Next, we 
compute each term in ( [T^ individually. To this end, using ( 241 1 and Lemma following the same 
steps as in ( 181 1 and ( |182[ ), and recalling that M = we have: 

{a — 


a 


+ o(l) < nlqi <{a- + o(l) 

from which using ( |241| ) and ( [20] ) we have: 

m.r) > 1 - exp 

(1 — 6 )‘^{a — l)c^““ 


(244) 


> 1 — exp ( — 

Moreover, using (|2T]) and ( 243 1, we have 


(1-^r 


P 2 {(., 1 ,T) = 1 - exp 

from which replacing p41| )-( |243] ), p45| ) and p46| ) in the second term of ( [T9| ) we obtain: 

i?'‘’(n,m,M,q) > Pi(£, r)P 2 (A 1,5)(1 - M/^) 


(245) 


(246) 


> 1 — exp — 


Using ( 240| ) and ( 247| ), we obtain 

R^^{n, m, M, q, fh) 


(1 — 5)'^{a — l)c 


1 — 0 . 


2a2 


1 — exp 


(1-^r 


1 

1 - - 
c 


(247) 


^l—a 


R^^(n, m, M, q) 


< 


(a) 

< 


1 — exp 




(1-^) 1-exp - 




2q2 


{jS(a—l) 


1 — exp 

where (a) follows from the fact that k > 




(l-a l-exp - 


(l-(5)2(a-l)ci 
2a2 


,(248) 


rj(a-^\ <»-i 


with r] = ad. 
2) When $M = \Theta(m) = \kappa_1 m + o(m) = \kappa n^{\frac{1}{\alpha-1}} + o(n^{\frac{1}{\alpha-1}})$, with $0 < \kappa_1 < 1$ and $0 < \kappa < 1$: In this
case, we distinguish between two cases: $\frac{m}{M} = \frac{1}{\kappa_1} < 2$ and $\frac{m}{M} = \frac{1}{\kappa_1} > 2$ (see Fig. 13).


• If ^ ^ > 2, we need to consider, two further cases: k < 


(^) 


and K > 




(See Fig. [13] ). 
- Case K < 




same procedure adopted in Section 


_ 1 

: In this case, letting m = M= Ko.n°‘-^, and following exactly the 
when M = o{m) and k < 


G-Al 


we prove 


the order optimality of RLFU-GCC with m = M = K°n“-i. 


- Case K > 




: For this scenario we consider two further suh-cases: — > a and 

^vf 


— < u (See Fig. 131, where u > 2 is an arbitrary positive constant. 

1) When ^ > /X, letting m = M or fh = and following exactly the same procedure 

when M = o(m) and k > \ we prove the order 


adopted in Section 


G-Al 


optimality of RLFU-GCC with m = M or m = Man°. Note that, in this case, differently 


from the case when M = o(m) the constant c = £/M given in ( 241 1 is constrained to take 
values in 1 < c < /X in order to guarantee that £ < m. 

2) When A — letting m = M or m = M and following exactly the same procedure 

when M = o(m) and k > we prove the order 


adopted in Section 


G-Al 


optimality of RLFU-GCC with fh = M or in = Note that in this scenario, in 

order to guarantee that £ < m, recalling that ^ ^ > 2, the constant c given in ( 241 1 is 

constrained to take values in 1 < c < 2. 

Moreover, in this case (^ < p), we can prove that UP-GCC is also order-optimal. In the 
following, we derive the converse and the order-optimality of UP-GCC in this regime. Using 
and Lemma we have 


m, M,q,m) < ^ — 1 + o(l) = —— 1 + o(l). 
M Kl 


(249) 


To compute the converse, we use the second term in ( [T^ , where the required parameters 
are summarized in the following: 


£ = m, 
a — \ 


r = 


a 


K 


a—1 


z = a. 


(250) 

(251) 

(252) 


with 0 < (5 < 1, and 0 < cr < 1 positive constants determined in the following. Next, we 
compute each term in ([T^ individually. To this end, using ( |250| ), Lemma and the fact 
that, by assumption, M = Kim = Kn^-^, following the same steps as in ( [TT] ) and (721, we
have: 


a — 1 /Ki' 


a \ K 


+ o(l) < n£qi < (a - 1) + o(l), 


from which using (201 and ( |2511 ), we obtain 

-Pi (£? r) > 1 — exp 

(<^) / 

> 1 — exp — 


(1 — 6)‘^{a — 1 ) 


2a2 


K 


(1 -(5)2(a- 1) /I 


2a2 




a—1' 


where (a) is because ^ while using (21 1 , and \151) we have 


P 2 {£, 1 ,z) = 1 - exp 


[l-af 


(253) 


(254) 


(255) 


Replacing ( |250 l-( 252 ), ( 254 i and ( 255 1 in the second term of ( fTO] ) given in Theorem]^ we 
obtain 


i?"’(n,m,M,q) > Pi{£,r)P 2 {£,r,P){l - M/1) 


‘S’ 

m J 


(P 


(256) 


where in (a) we have defined E as: 


H = 1 - exp - 


(l-(5)2(a-l) fl 


m _ 1 


a—1' 


1 — exp 


{l-aY 


while in (b) we have used the fact that ^ ^ > 2. From ( |256| ) using (249), we have 


R^'°{n, m, M, q) 


(1-i) 


2fj, 


(257) 


where in (a) we have used the fact that ^ < /r. 


If ^ ^ < 2 (see Fig. 13) letting m = m, and using (99), we have 


R'^'°{n, m, M, q, m) < ^ - 1 + o(l) = 


1 + 0 ( 1 ). 


(258) 
To derive the converse, as before, we use the second term of (19), where the
parameters r, 2 are given as in P50| )-( |252l ). Following the same steps adopted in Eqs. P53| )-( |25^ , 
we obtain 


m, M, q) 


< 


(1-i) 


(“) 2 
< 


(259) 


where in (a) we have used the fact that — < 2. 


B. Region of $\kappa > 1$


In this case we need to consider two further sub-cases: $M = o(m)$ and $M = \Theta(m)$ (see Fig. 13).


1) When M = o{m): Letting m = M = Kn°-^ + o(n“-i), by using (99) and Lemma|^ we obtain 


< 


n 


= K 


1 — 0 


+ o 


n 


M' 


0—1 


+ 0 ( 1 ). (260) 

To compute the converse, as before, we use the second term of (19), where the
required parameters are summarized in the following:

$$\ell = cM, \tag{261}$$

$$r = \frac{\delta(\alpha-1)\,c^{1-\alpha}\kappa^{1-\alpha}}{\alpha}, \tag{262}$$

$$\tilde z = \sigma, \tag{263}$$

with c>l,0<()<l, and 0 < ct < 1 positive constants determined in the following. Next, we compute 

each term in ( [T^ individually. To this end, using ( 261 1, Lemma and the fact that, by assumption, 

1 1 

M = + o(n“-i), following the same steps as in (|181|) and (|182|), we have: 


$$\frac{(\alpha-1)\,c^{1-\alpha}\kappa^{1-\alpha}}{\alpha} + o(1) \le n\ell q_\ell \le (\alpha-1)\,c^{1-\alpha}\kappa^{1-\alpha} + o(1), \tag{264}$$


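from which, using (20) and (262), we obtain

$$
\begin{aligned}
P_1(\ell,r) &\ge 1-\exp\left(-\frac{(1-\delta)^2(\alpha-1)\,c^{1-\alpha}\kappa^{1-\alpha}}{2\alpha^2}\right)\\
&\overset{(a)}{\ge} \frac{(1-\delta)^2(\alpha-1)\,c^{1-\alpha}\kappa^{1-\alpha}}{2\alpha^2} - \frac{1}{2}\left(\frac{(1-\delta)^2(\alpha-1)\,c^{1-\alpha}\kappa^{1-\alpha}}{2\alpha^2}\right)^2,
\end{aligned}
\tag{265}
$$

where (a) follows from the fact that $1 - e^{-x} \ge x - \frac{x^2}{2}$. Furthermore, using (21) and (263), we have

$$P_2(\ell,1,\tilde z) = 1-\exp\left(-\frac{(1-\sigma)^2}{2}\right), \tag{266}$$

from which, replacing (261)-(263), (265) and (266) in the second term of (19), we obtain:

$$R^{\rm lb}(n,m,M,\mathbf{q}) \ge \Xi\left(1-\frac{1}{c}\right), \tag{267}$$

with

$$\Xi = \left(\frac{(1-\delta)^2(\alpha-1)\,c^{1-\alpha}\kappa^{1-\alpha}}{2\alpha^2} - \frac{1}{2}\left(\frac{(1-\delta)^2(\alpha-1)\,c^{1-\alpha}\kappa^{1-\alpha}}{2\alpha^2}\right)^2\right)\left(1-\exp\left(-\frac{(1-\sigma)^2}{2}\right)\right).$$
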


Using (260) and (267), we obtain


m, M, q, m) 
7?*’’(n, m, M, q) 




< 


(a) 

< 


-(1-D 


(l-5)2(a-l)ci- 

2«2 


1 ( (l-(S)^(a-l)ci- 


2a2 


1 — exp 


(l-a) 


■))(!-« 


where (a) is because $\kappa > 1$.
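
Step (a) in (265) relies on the elementary inequality $1 - e^{-x} \ge x - \frac{x^2}{2}$ for $x \ge 0$, which follows from the alternating Taylor series of $e^{-x}$. A quick numerical sanity check on a grid, as a sketch only (the helper name `lower_bound_gap` and the grid are hypothetical choices):

```python
import math


def lower_bound_gap(x):
    """Return (1 - exp(-x)) - (x - x**2 / 2); non-negative for x >= 0."""
    # -expm1(-x) computes 1 - exp(-x) accurately for small x.
    return -math.expm1(-x) - (x - 0.5 * x * x)


# Evaluate the gap on a grid of x values in [0, 5].
xs = [k / 100.0 for k in range(0, 501)]
gaps = [lower_bound_gap(x) for x in xs]
```

The gap behaves like $x^3/6$ near zero, so the quadratic lower bound is tight exactly in the regime where the exponent in (265) is small, that is, for large $\kappa$.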

2) When $M = \Theta(m) = \kappa_1 m + o(m)$ with $\kappa > 1$: In this case, we distinguish between two cases:
$\frac{m}{M} = \frac{1}{\kappa_1} < 2$ and $\frac{m}{M} = \frac{1}{\kappa_1} > 2$, where $0 < \kappa_1 < 1$ (see Fig. 13).

• If $\frac{m}{M} = \frac{1}{\kappa_1} > 2$, letting $\tilde m = M$ and following exactly the same procedure adopted in Section G-B1,
we prove the order optimality of RLFU-GCC with $\tilde m = M$. Note that in this scenario, differently
from Section G-B1, in order to guarantee that $\ell < m$, recalling that $\frac{m}{M} = \frac{1}{\kappa_1} > 2$, the constant $c$
given in (241) is constrained to take values in $1 < c < 2$.
If ^ ^ < 2, leting m = M, using (99) and Lemma we have

R'^^{n,m, M,ci,m) < (1 —Gm)n 


= n 


< n 


f=fh+l 

H{a,m + 1, m) 


H{a, l,m) 

!_ l-a _ _|_ J- 

1—a 1—a^ > (m+l)° 


Y^im + l)i“" — 

1—a'' ^ 1—OL 

= n(—+ (^im + 1)^““) + o(l) 
— (k}~“ — l) + o (1) + o(l) 

= (k}““ - l) n +o(l) 

= “+o(i) 

+o(l). 


= K 


(268) 
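
The first steps of (268) bound the achievable rate by the expected number of requests that fall outside the $\tilde m$ most popular files, $(1 - G_{\tilde m})\,n = n\,H(\alpha,\tilde m+1,m)/H(\alpha,1,m)$. The sketch below evaluates this quantity exactly for a Zipf distribution. It assumes $H(\alpha,x,y) = \sum_{i=x}^{y} i^{-\alpha}$, which is the usual definition but is not restated in this appendix; the function names are hypothetical.

```python
def H(alpha, x, y):
    """Partial Zipf sum: sum of i**(-alpha) for i = x..y (assumed definition)."""
    return sum(i ** (-alpha) for i in range(x, y + 1))


def uncached_demand(n, m, m_tilde, alpha):
    """Expected number of the n requests that hit files m_tilde+1..m.

    This is n * H(alpha, m_tilde + 1, m) / H(alpha, 1, m), the quantity
    that the first lines of (268) upper bound.
    """
    return n * H(alpha, m_tilde + 1, m) / H(alpha, 1, m)


# Example: n = 10 users, m = 100 files, alpha = 1.5, caching the top 10 or top 50 files.
top10 = uncached_demand(10, 100, 10, 1.5)
top50 = uncached_demand(10, 100, 50, 1.5)
```

Comparing such exact values with the closed-form bounds of (268) for moderate $n$ and $m$ is a cheap way to check the constants in the chain of inequalities.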


To compute the converse, as before, we use the second term of (19), where
the required parameters are summarized in the following:


i = m, 

a — 1 ^ f 


r = 


a 




z = a. 


(269) 

(270) 

(271) 


with 0 < <5 < 1, and 0 < cr < 1 positive constants determined in the following. Recall here that, 
hy assumption, k > 1 while 0 < ki < 1. Next, we compute each term in ( [T^ individually. To this 
end, using ( |269| ), Lemma and the fact that, hy assumption, M = kti^^ = Kim (see Fig. 13), 
following the same steps as in ( 181 ) and ( 182[ ), we have: 

^ + 0(1) < niqi < {a - 1) +o(l), (272) 


a 


from which using using (20 and P70| ), we obtain 

Piie,r) 

(1 — 6)‘^{a — 1 ) /'Ki\<^-^ 
2a2 


> 1 — exp — 


(^) (1 — (5)2(ct — 1) 1 /(I — 6)^{a — 1) fKi\<^-^ 

- ^ WJ ~ 2 [ 2^;^2 


( 273 ) 
where (a) follows from the fact that $1 - e^{-x} \ge x - \frac{x^2}{2}$. Furthermore, using (21) and (271), we
have:


P 2 (f’, 1,?) = 1 — exp 


(274) 


from which, replacing ( |269[ )-( |27T] ), ( |273| l, and ( |274| ) in the second term of ( [T^ , we obtain 

m, M, q) 




with 


> Pi(£,r)P2(^,l,5)(l-M/£) 
MX 
m y 


(1 — 5)^(a — 1) /KiX"-! 1 /(1 — (5)^(a — 1) 

“2V ^ 


(275) 


1 — exp 


(1-a)^ 


Using ( |268| ) and ( |275| ), we obtain 

P“'’(n, m, M, q, in) 


A—(y 


m, M, q) 


< 


< 


< 


(1 - 




;i-f) 


S 1 _ M 


a—1 


Next note that, using the fact that k > 1 and 0 < < 1, 


^l—a 


< 


(l-(5)2(a-l) 1 1 /(l-,5)2(a-l) 


2a2 


10—1 


2a2 


with (5 selected such that the right-hand side of ( |277[ ) is positive. Furthermore 

a—1 


1 - 


M 


1 - 


< 


M — 


from which, using ( |276 ) and (2781, we obtain 

R'^'°{n, m, M, q, in) 


R}^{n, m, M, q) 


< 


1 , a <2 

a — 1, a > 2 

max{l, a — 1} 


(276) 

{ 111 ) 

(278a) 

(278b) 


(1 - „( 1 )) - 4 (l - exp 







































Appendix H


Proof of Corollary 1

To show Corollary [T] we follow the same procedure as the proof of Theorem By using Lemma [T] 
it is straightforward to see 

$$R^{\rm ub}(n,m,M,\mathbf{q},\tilde m) = \min\left\{\frac{m}{M}-1,\; m\right\}. \tag{279}$$

To evaluate the converse, we compute each term in (19) individually using $\ell$, $r$, $\tilde z$, $z$, as
summarized in the following.

$$\ell = m, \tag{280}$$

$$r = \delta\,\frac{m^{1-\alpha}}{H(\alpha,1,m)}\,n, \tag{281}$$

$$\tilde z = \max\{(1-\epsilon)m,\; 1\}, \tag{282}$$


with 0 < (5 < 1 positive constants determined in the following and 0 < e < ^ an arbitrarily small 
constant. Note that, by assumption, since $n \to \infty$ and $m$ is kept constant, we have $\tilde z < r$.


Next, we compute each term in (19) individually. To this end, using (280), we have:

$$n\ell q_\ell = n\,\frac{m^{1-\alpha}}{H(\alpha,1,m)}, \tag{283}$$

from which, using (20) and (281), we obtain

$$P_1(\ell,r) = 1-\exp\left(-\frac{(1-\delta)^2\, n\, m^{1-\alpha}}{2H(\alpha,1,m)}\right) = 1-o(1). \tag{284}$$

Using (280) and (281), we have

$$\ell\left(1-\left(1-\frac{1}{\ell}\right)^{r}\right) = m\left(1-\left(1-\frac{1}{m}\right)^{\delta\frac{m^{1-\alpha}}{H(\alpha,1,m)}n}\right) = m(1-o(1)), \tag{285}$$

from which, using (21) and (282), we have

$$P_2(\ell,r,\tilde z) = 1-o(1). \tag{286}$$

Replacing (280)-(282), (284) and (286) in (19), we obtain

$$R^{\rm lb}(n,m,M,\mathbf{q}) \ge (1-o(1))^2 \max_{z\in\{1,\dots,\lceil\tilde z\rceil\}} z\left(1 - M/\lfloor \ell/z\rfloor\right). \tag{287}$$
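
The limits in (284) and (285) can be illustrated numerically for a fixed library size $m$ and growing $n$. The sketch below is an illustration under stated assumptions, not the paper's code: it assumes a Zipf demand with $q_\ell = \ell^{-\alpha}/H(\alpha,1,m)$, and it assumes that (20) has the form $P_1(\ell,r) = 1-\exp\left(-\frac{(n\ell q_\ell - r)^2}{2n\ell q_\ell}\right)$, which is how (20) is applied in this appendix. The function names are hypothetical.

```python
import math


def zipf_H(alpha, m):
    """H(alpha, 1, m): sum of i**(-alpha) for i = 1..m (assumed definition)."""
    return sum(i ** (-alpha) for i in range(1, m + 1))


def appendix_h_terms(n, m, alpha, delta):
    """Return (P1, ell * (1 - (1 - 1/ell)**r)) for ell = m and r = delta * n * ell * q_ell."""
    ell = m
    mean = n * m ** (1 - alpha) / zipf_H(alpha, m)   # n * ell * q_ell, cf. (283)
    r = delta * mean                                  # choice of r, cf. (281)
    # Assumed form of (20): 1 - exp(-(mean - r)^2 / (2 * mean)).
    p1 = 1.0 - math.exp(-((mean - r) ** 2) / (2.0 * mean))
    occupied = ell * (1.0 - (1.0 - 1.0 / ell) ** r)   # left-hand side of (285)
    return p1, occupied


# m = 5 files, alpha = 0.8, delta = 0.5, for a small and a large number of users.
p1_small, occ_small = appendix_h_terms(10, 5, 0.8, 0.5)
p1_big, occ_big = appendix_h_terms(10 ** 4, 5, 0.8, 0.5)
```

As $n$ grows with $m$ fixed, `p1` approaches 1 and `occupied` approaches $m$, matching the $1-o(1)$ and $m(1-o(1))$ statements in (284) and (285).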
A. When $M < \frac{1}{2}$


In this case, letting $z = \max\{(1-\epsilon)m,\, 1\}$, and using (287), we have

$$R^{\rm lb}(n,m,M,\mathbf{q}) \ge (1-o(1))^2 \max\{(1-\epsilon)m,\, 1\}\,(1-M).$$


Then, by using (279) and (288), we obtain

m, M, q, m) 


m, M, q) 


< 

< 

< 


m 


(1 — o(l))2 max{(l — e)m, 1} (1 — M) 
m 


i max{(l — e)m, 1} 


1 - e 


(288) 


(289) 


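B. When $\frac{1}{2} \le M < 1+\epsilon$


If $m < 3$, letting $z = 1$, by using (287), we have

$$R^{\rm lb}(n,m,M,\mathbf{q}) \ge (1-o(1))^2\left(1-\frac{M}{m}\right). \tag{290}$$

By using (279) and (290), we obtain

$$\frac{R^{\rm ub}(n,m,M,\mathbf{q},\tilde m)}{R^{\rm lb}(n,m,M,\mathbf{q})} \le \frac{\frac{m}{M}-1}{(1-o(1))^2\left(1-\frac{M}{m}\right)} = \frac{m}{M} \le 6. \tag{291}$$

If $m \ge 3$, letting $z = \lfloor \frac{m}{2} \rfloor$, and using (287), we have

$$R^{\rm lb}(n,m,M,\mathbf{q}) \ge (1-o(1))^2 \left\lfloor\frac{m}{2}\right\rfloor\left(1-\frac{M}{2}\right). \tag{292}$$

By using (279) and (292), we obtain

$$
\begin{aligned}
\frac{R^{\rm ub}(n,m,M,\mathbf{q},\tilde m)}{R^{\rm lb}(n,m,M,\mathbf{q})}
&\le \frac{m}{(1-o(1))^2\left\lfloor\frac{m}{2}\right\rfloor\left(1-\frac{M}{2}\right)}\\
&\le \frac{m}{\left(\frac{m}{2}-1\right)\left(1-\frac{M}{2}\right)}\\
&= \frac{1}{\left(\frac{1}{2}-\frac{1}{m}\right)\left(1-\frac{M}{2}\right)}\\
&\le \frac{1}{\left(\frac{1}{2}-\frac{1}{3}\right)\left(1-\frac{1+\epsilon}{2}\right)}\\
&= \frac{12}{1-\epsilon}.
\end{aligned}
\tag{293}
$$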
C. When 1 + e < M < f


Letting z = and using (2871, we obtain 


m, M, q) > (1 — o(l)) 


(a) 


2 _ 
\2M 

2 /"ZL _ 


1 1 - 


M \ 


s y2M 


1 1 -- 


- 1 


where (a) is because that when M > 1 + e, j0ij < Then, by using (|279| and (|294|, we have 

m, M, q, in) 


R^^{n, m, M, q) 


< 


M 


4 

a _ m 

V2 mJ 

< 12 . 


< 


D. When M > f 


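Letting $z = 1$, and using (287), we obtain

$$R^{\rm lb}(n,m,M,\mathbf{q}) \ge (1-o(1))^2\left(1-\frac{M}{m}\right).$$

Hence, using (279) and (296), we have

$$\frac{R^{\rm ub}(n,m,M,\mathbf{q},\tilde m)}{R^{\rm lb}(n,m,M,\mathbf{q})} \le \frac{\frac{m}{M}-1}{(1-o(1))^2\left(1-\frac{M}{m}\right)} \le \frac{m}{M} \le 6.$$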


(294) 


(295) 


(296) 


( 297 ) 